Your email was sent successfully. Check your inbox.

An error occurred while sending the email. Please try again.

Proceed reservation?

Export
  • 1
    Online Resource
    Online Resource
    Angle Publishing Co., Ltd. ; 2023
    In:  電腦學刊 Vol. 34, No. 3 ( 2023-06), p. 075-091
    In: 電腦學刊, Angle Publishing Co., Ltd., Vol. 34, No. 3 ( 2023-06), p. 075-091
    Abstract: 〈p〉The feature extraction ability of lightweight convolutional neural networks in speaker recognition systems is weak. And recognition accuracy is poor. Many methods use deeper, wider, and more complex network structures to improve the feature extraction ability. But it makes the parameters and inference time increase exponentially. In the paper, we introduce Res2Net in target detection task to speaker recognition task and verify its effectiveness and robustness in the speaker recognition task. And we improved and proposed FullRes2Net. It has better multi-scale feature extraction ability without increasing the number of parameters. Then, we proposed the mixed time-frequency channel attention to solve the problems of existing attention methods to improve the shortcomings of convolution itself and further enhance the feature extraction ability of convolutional neural networks. Experiments were conducted on the Voxceleb dataset. The results show that the MTFC-FullRes2Net end-to-end speaker recognition system proposed in this paper effectively improves the feature extraction and generalization ability of the Res2Net. Compared to Res2Net, MTFC-FullRes2Net performance improves by 31.5%. And Compared to ThinResNet-50, RawNet, CNN+Transformer and Y-vector, MTFC-FullRes2Net performance is improved by 56.5%, 14.1%, 16.7% and 23.4%, respectively. And it is superior to state-of-the-art speaker recognition systems that use complex structures. It is a lightweight and more efficient end-to-end architecture and is also more suitable for practical application.〈/p〉 〈p〉 〈/p〉
    Type of Medium: Online Resource
    ISSN: 1991-1599 , 1991-1599
    Uniform Title: End-to-end Speaker Recognition Based on MTFC-FullRes2Net
    Language: Unknown
    Publisher: Angle Publishing Co., Ltd.
    Publication Date: 2023
    Library Location Call Number Volume/Issue/Year Availability
    BibTip Others were also interested in ...
Close ⊗
This website uses cookies and the analysis tool Matomo. Further information can be found on the KOBV privacy pages