Multi-Task Learning with High-Order Statistics for X-vector based Text-Independent Speaker Verification

Cited by: 8
Authors
You, Lanhua [1]
Guo, Wu [1]
Dai, Li-Rong [1]
Du, Jun [1]
Affiliations
[1] University of Science and Technology of China, National Engineering Laboratory for Speech and Language Information Processing, Hefei, People's Republic of China
Source
Funding
National Natural Science Foundation of China;
Keywords
Speaker verification; High-order statistics; X-vector; Multi-task learning; Unsupervised learning;
DOI
10.21437/Interspeech.2019-2264
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104 ; 100213 ;
Abstract
X-vector based deep neural network (DNN) embedding systems have proven effective for text-independent speaker verification. This paper presents a multi-task learning architecture for training the speaker embedding DNN, with the primary task of classifying the target speakers and the auxiliary task of reconstructing the first- and higher-order statistics of the original input utterance. The proposed training strategy combines supervised and unsupervised learning in one framework to make the speaker embeddings more discriminative and robust. Experiments are carried out on the NIST SRE16 evaluation dataset and the VOiCES dataset. The results show that the proposed method outperforms the original x-vector approach while adding very little complexity.
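The training objective described in the abstract can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the choice of mean, standard deviation, skewness, and kurtosis as the "first- and higher-order statistics", the mean-squared-error reconstruction loss, the weight `aux_weight`, and the function names `high_order_stats` and `multitask_loss`.

```python
import numpy as np

def high_order_stats(frames, eps=1e-8):
    """Per-dimension mean, std, skewness and kurtosis of a (T, D) frame matrix.

    Returns a vector of length 4 * D. The choice of these four statistics
    is an assumption made for this sketch.
    """
    mean = frames.mean(axis=0)
    centered = frames - mean
    std = np.sqrt((centered ** 2).mean(axis=0) + eps)
    z = centered / std
    skew = (z ** 3).mean(axis=0)   # third standardized moment
    kurt = (z ** 4).mean(axis=0)   # fourth standardized moment
    return np.concatenate([mean, std, skew, kurt])

def cross_entropy(logits, label):
    """Softmax cross-entropy for one utterance (primary speaker-ID task)."""
    shifted = logits - logits.max()              # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[label]

def multitask_loss(logits, label, predicted_stats, frames, aux_weight=0.1):
    """Speaker classification loss plus a weighted reconstruction loss.

    `predicted_stats` stands in for the output of a hypothetical auxiliary
    network head; `aux_weight` is an assumed hyperparameter.
    """
    target = high_order_stats(frames)            # needs no speaker labels
    recon = np.mean((predicted_stats - target) ** 2)
    return cross_entropy(logits, label) + aux_weight * recon
```

Because the reconstruction target is computed from the input frames alone, the auxiliary term needs no labels, which is the sense in which supervised and unsupervised learning share one objective in this sketch.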
Pages: 1158-1162
Number of pages: 5
Related papers
50 in total
  • [1] Linear transformation on x-vector for text-independent speaker verification
    Xu, Longting
    Ren, Bo
    Zhang, Guanglin
    Yang, Jichen
    [J]. ELECTRONICS LETTERS, 2019, 55 (15) : 864 - 865
  • [2] An Adaptive X-vector Model for Text-independent Speaker Verification
    Gu, Bin
    Guo, Wu
    Ding, Fenglin
    Ling, Zhenhua
    Du, Jun
    [J]. INTERSPEECH 2020, 2020, : 1506 - 1510
  • [3] Multi-task learning for X-vector based speaker recognition
    Zhang Y.
    Liu L.
    [J]. International Journal of Speech Technology, 2023, 26 (04) : 817 - 823
  • [4] Influence of task duration in text-independent speaker verification
    Fauve, Benoit
    Evans, Nicholas
    Pearson, Neil
    Bonastre, Jean-Francois
    Mason, John
    [J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 2728 - +
  • [5] Text-Independent Speaker Verification Based on Information Theoretic Learning
    Memon, Sheeraz
    Khanzada, Tariq Jameel Saifullah
    Bhatti, Sania
    [J]. MEHRAN UNIVERSITY RESEARCH JOURNAL OF ENGINEERING AND TECHNOLOGY, 2011, 30 (03) : 457 - 468
  • [6] Local Variability Vector for Text-Independent Speaker Verification
    Chen, Liping
    Lee, Kong Aik
    Ma, Bin
    Guo, Wu
    Li, Haizhou
    Dai, Li Rong
    [J]. 2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 54 - +
  • [7] Deep multi-metric learning for text-independent speaker verification
    Xu, Jiwei
    Wang, Xinggang
    Feng, Bin
    Liu, Wenyu
    [J]. NEUROCOMPUTING, 2020, 410 (410) : 394 - 400
  • [8] Multi-Task Learning for Text-dependent Speaker Verification
    Chen, Nanxin
    Qian, Yanmin
    Yu, Kai
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 185 - 189
  • [9] Deep Speaker Feature Learning for Text-independent Speaker Verification
    Li, Lantian
    Chen, Yixiang
    Shi, Ying
    Tang, Zhiyuan
    Wang, Dong
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1542 - 1546
  • [10] Vector-Based Attentive Pooling for Text-Independent Speaker Verification
    Wu, Yanfeng
    Guo, Chenkai
    Gao, Hongcan
    Hou, Xiaolei
    Xu, Jing
    [J]. INTERSPEECH 2020, 2020, : 936 - 940