JOINT I-VECTOR WITH END-TO-END SYSTEM FOR SHORT DURATION TEXT-INDEPENDENT SPEAKER VERIFICATION

被引:0
|
作者
Huang, Zili [1 ]
Wang, Shuai [1 ]
Qian, Yanmin [1 ,2 ]
机构
[1] Shanghai Jiao Tong Univ, Brain Sci & Technol Res Ctr, Key Lab Shanghai Educ Commiss Intelligent Interac, Speech Lab,Dept Comp Sci & Engn, Shanghai, Peoples R China
[2] Tencent, Tencent AI Lab, Bellevue, WA 98004 USA
关键词
speaker verification; end-to-end; i-vector; triplet loss; hard trial selection; EMBEDDINGS;
D O I
暂无
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
Factor analysis based i-vector has been the state-of-the-art method for speaker verification. Recently, researchers propose to build DNN based end-to-end speaker verification systems and achieve comparable performance with i-vector. Since these two methods possess their own property and differ from each other significantly, we explore a framework to integrate these two paradigms together to utilize their complementarity. More specifically, in this paper we develop and compare four methodologies to integrate traditional i-vector into end-to-end systems, including score fusion, embeddings concatenation, transformed concatenation and joint learning. All these approaches achieve significant gains. Moreover, the hard trial selection is performed on the end-to-end architecture which further improves the performance. Experimental results on a text-independent short-duration dataset generated from SRE 2010 reveal that the newly proposed method reduces the EER by relative 31.0% and 28.2% compared to the i-vector and end-to-end baselines respectively.
引用
收藏
页码:4869 / 4873
页数:5
相关论文
共 50 条
  • [1] END-TO-END TEXT-INDEPENDENT SPEAKER VERIFICATION WITH FLEXIBILITY IN UTTERANCE DURATION
    Zhang, Chunlei
    Koishida, Kazuhito
    2017 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2017, : 584 - 590
  • [2] Strategies for End-to-End Text-Independent Speaker Verification
    Lin, Weiwei
    Mak, Man-Wai
    Chien, Jen-Tzung
    INTERSPEECH 2020, 2020, : 4308 - 4312
  • [3] End-to-End Text-Independent Speaker Verification with Triplet Loss on Short Utterances
    Zhang, Chunlei
    Koishida, Kazuhito
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1487 - 1491
  • [4] An End-to-End Text-Independent Speaker Identification System on Short Utterances
    Ji, Ruifang
    Cai, Xinyuan
    Xu, Bo
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3628 - 3632
  • [5] End-to-End Feature Learning for Text-Independent Speaker Verification
    Chen, Fangzhou
    Bian, Tengyue
    Xu, Li
    PROCEEDINGS OF THE 2019 31ST CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2019), 2019, : 3949 - 3954
  • [6] Weighted I-Vector Based Text-Independent Speaker Verification System
    Mohammadi, Mohsen
    Mohammadi, Hamid Reza Sadegh
    2019 27TH IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE 2019), 2019, : 1647 - 1653
  • [7] An End-to-End Text-independent Speaker Verification Framework with a Keyword Adversarial Network
    Yun, Sungrack
    Cho, Janghoon
    Eum, Jungyun
    Chang, Wonil
    Hwang, Kyuwoong
    INTERSPEECH 2019, 2019, : 2923 - 2927
  • [8] DEEP BOTTLENECK FEATURES FOR I-VECTOR BASED TEXT-INDEPENDENT SPEAKER VERIFICATION
    Ghalehjegh, Sina Hamidi
    Rose, Richard C.
    2015 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), 2015, : 555 - 560
  • [9] Avoiding Speaker Overfitting in End-to-End DNNs using Raw Waveform for Text-Independent Speaker Verification
    Jung, Jee-Weon
    Heo, Hee-Soo
    Yang, Il-Ho
    Shim, Hye-Jin
    Yu, Ha-Jin
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3583 - 3587
  • [10] I-vector Based Text-Independent Speaker Identification
    Liu, Tingting
    Kang, Kai
    Guan, Shengxiao
    2014 11TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA), 2014, : 5420 - 5425