Improving X-vector and PLDA for Text-dependent Speaker Verification

Citations: 5
Authors
Chen, Zhuxin [1]
Lin, Yue [1]
Affiliations
[1] NetEase Games AI Lab, Hangzhou, People's Republic of China
Source
Keywords
speech verification; x-vector; PLDA; SDSVC 2020; short utterance
DOI
10.21437/Interspeech.2020-1188
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104 ; 100213 ;
Abstract
Recently, the pipeline consisting of an x-vector speaker embedding front-end and a Probabilistic Linear Discriminant Analysis (PLDA) back-end has achieved state-of-the-art results in text-independent speaker verification. In this paper, we further improve the performance of x-vector and PLDA based systems for text-dependent speaker verification by exploring the choice of layer used to produce the embedding and by modifying the back-end training strategies. In particular, we find that x-vector based embeddings, specifically the standard deviation statistics in the pooling layer, contain information related to both speaker characteristics and spoken content. Accordingly, we modify the back-end training labels to use both the speaker-id and the phrase-id. A correlation-alignment-based PLDA adaptation is also adopted to make use of the text-independent labeled data during back-end training. Experimental results on the SDSVC 2020 dataset show that our proposed methods achieve significant performance improvements over the x-vector and HMM based i-vector baselines.
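The three ideas named in the abstract (statistics pooling, joint speaker and phrase labels, and correlation alignment) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names are hypothetical, the joint label is assumed to be a simple string concatenation, and the alignment shown is plain feature-level covariance alignment, which may differ from the PLDA-level adaptation used in the paper.

```python
import numpy as np


def stats_pooling(frames):
    """Pool frame-level features (T, D) into one vector (2*D,).

    The first D entries are per-dimension means, the last D entries are
    per-dimension (population) standard deviations over time.
    """
    return np.concatenate([frames.mean(axis=0), frames.std(axis=0)])


def joint_labels(speaker_ids, phrase_ids):
    """Assumed labeling scheme: each (speaker, phrase) pair is one class."""
    return [f"{s}_{p}" for s, p in zip(speaker_ids, phrase_ids)]


def _sym_power(mat, power, eps=1e-6):
    """Raise a symmetric positive semi-definite matrix to a real power."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, eps, None)  # guard against tiny or negative eigenvalues
    return (vecs * vals**power) @ vecs.T


def coral(source, target, eps=1e-6):
    """Feature-level correlation alignment (illustrative only).

    Whitens the source embeddings with the source covariance, then
    re-colours them with the target covariance, so that the transformed
    source data has approximately the target covariance.
    """
    dim = source.shape[1]
    cov_s = np.cov(source, rowvar=False) + eps * np.eye(dim)
    cov_t = np.cov(target, rowvar=False) + eps * np.eye(dim)
    return source @ _sym_power(cov_s, -0.5) @ _sym_power(cov_t, 0.5)
```

In a setup like the one the abstract describes, `source` would play the role of the text-independent embeddings and `target` the text-dependent in-domain embeddings, with `joint_labels` supplying the class labels for back-end training; those roles are an assumption made here for illustration.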
Pages: 726-730
Number of pages: 5
Related papers
50 in total
  • [21] A Survey on Text-Dependent and Text-Independent Speaker Verification
    Tu, Youzhi
    Lin, Weiwei
    Mak, Man-Wai
    [J]. IEEE ACCESS, 2022, 10 : 99038 - 99049
  • [22] Analysis of the Hilbert Spectrum for Text-Dependent Speaker Verification
    Sharma, Rajib
    Bhukya, Ramesh K.
    Prasanna, S. R. M.
    [J]. SPEECH COMMUNICATION, 2018, 96 : 207 - 224
  • [23] Towards Goat Detection in Text-Dependent Speaker Verification
    Toledo-Ronen, Orith
    Aronowitz, Hagai
    Hoory, Ron
    Pelecanos, Jason
    Nahamoo, David
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 16 - +
  • [24] Deep Embedding Learning for Text-Dependent Speaker Verification
    Zhang, Peng
    Hu, Peng
    Zhang, Xueliang
    [J]. INTERSPEECH 2020, 2020, : 3461 - 3465
  • [25] Tandem Deep Features for Text-Dependent Speaker Verification
    Fu, Tianfan
    Qian, Yanmin
    Liu, Yuan
    Yu, Kai
    [J]. 15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 1327 - 1331
  • [26] EXPLOITING SEQUENCE INFORMATION FOR TEXT-DEPENDENT SPEAKER VERIFICATION
    Dey, Subhadeep
    Motlicek, Petr
    Madikeri, Srikanth
    Ferras, Marc
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5370 - 5374
  • [27] Data Augmentation Enhanced Speaker Enrollment for Text-dependent Speaker Verification
    Sarkar, Achintya Kumar
    Sarma, Himangshu
    Dwivedi, Priyanka
    Tan, Zheng-Hua
    [J]. 2020 3RD INTERNATIONAL CONFERENCE ON ENERGY, POWER AND ENVIRONMENT: TOWARDS CLEAN ENERGY TECHNOLOGIES (ICEPE 2020), 2021,
  • [28] Template-matching for text-dependent speaker verification
    Dey, Subhadeep
    Motlicek, Petr
    Madikeri, Srikanth
    Ferras, Marc
    [J]. SPEECH COMMUNICATION, 2017, 88 : 96 - 105
  • [29] On Residual CNN in Text-Dependent Speaker Verification Task
    Malykh, Egor
    Novoselov, Sergey
    Kudashev, Oleg
    [J]. SPEECH AND COMPUTER, SPECOM 2017, 2017, 10458 : 593 - 601
  • [30] MODELLING THE ALTERNATIVE HYPOTHESIS FOR TEXT-DEPENDENT SPEAKER VERIFICATION
    Larcher, Anthony
    Lee, Kong Aik
    Ma, Bin
    Li, Haizhou
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,