End-to-End Text-Dependent Speaker Verification

Cited by: 0
Authors
Heigold, Georg [1 ,2 ,4 ]
Moreno, Ignacio [3 ]
Bengio, Samy [3 ]
Shazeer, Noam [3 ]
Affiliations
[1] Univ Saarland, D-66123 Saarbrucken, Germany
[2] DFKI, Berlin, Germany
[3] Google Inc, Mountain View, CA USA
[4] Google, Mountain View, CA USA
Keywords
speaker verification; end-to-end training; deep learning
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper we present a data-driven, integrated approach to speaker verification, which maps a test utterance and a few reference utterances directly to a single score for verification and jointly optimizes the system's components using the same evaluation protocol and metric as at test time. Such an approach will result in simple and efficient systems, requiring little domain-specific knowledge and making few model assumptions. We implement the idea by formulating the problem as a single neural network architecture, including the estimation of a speaker model on only a few utterances, and evaluate it on our internal "Ok Google" benchmark for text-dependent speaker verification. The proposed approach appears to be very effective for big data applications like ours that require highly accurate, easy-to-maintain systems with a small footprint.
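The mapping described in the abstract (a test utterance plus a few reference utterances in, one verification score out, trained with the same accept/reject criterion used at test time) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not stated in the abstract: utterances are taken to be already encoded as fixed-length embedding vectors by some network that is not shown, the speaker model is assumed to be the average of the reference embeddings, and the score is assumed to be a logistic function of cosine similarity with hypothetical parameters `w` and `b`.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def speaker_model(references: List[Vector]) -> List[float]:
    """Estimate a speaker model from a few reference embeddings.

    Assumption: the model is the element-wise mean of the embeddings.
    """
    if not references:
        raise ValueError("need at least one reference embedding")
    dim = len(references[0])
    return [sum(r[i] for r in references) / len(references) for i in range(dim)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def verification_score(test: Vector, references: List[Vector],
                       w: float = 1.0, b: float = 0.0) -> float:
    """Map (test, references) to one score in (0, 1).

    Assumption: score = sigmoid(w * cosine(test, speaker_model) + b),
    where w and b are hypothetical trainable parameters.
    """
    similarity = cosine(test, speaker_model(references))
    return 1.0 / (1.0 + math.exp(-(w * similarity + b)))


def end_to_end_loss(score: float, same_speaker: bool) -> float:
    """Binary cross-entropy on the accept/reject decision for one trial."""
    p = score if same_speaker else 1.0 - score
    return -math.log(p)


if __name__ == "__main__":
    refs = [[1.0, 0.0], [1.0, 0.0]]            # toy reference embeddings
    target = verification_score([1.0, 0.0], refs)
    impostor = verification_score([0.0, 1.0], refs)
    print(target > impostor)                    # matching trial scores higher
```

In a trained system the loss would be backpropagated through the scoring function, the speaker-model estimate, and the embedding network together, which is what joint optimization of the components means in this setting; the sketch only shows the forward computation for one trial.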
Pages: 5115-5119
Page count: 5
Related papers
50 in total
  • [1] END-TO-END ATTENTION BASED TEXT-DEPENDENT SPEAKER VERIFICATION
    Zhang, Shi-Xiong
    Chen, Zhuo
    Zhao, Yong
    Li, Jinyu
    Gong, Yifan
    [J]. 2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016), 2016, : 171 - 178
  • [2] End-to-end text-dependent speaker verification using novel distance measures
    Dey, Subhadeep
    Madikeri, Srikanth
    Motlicek, Petr
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3598 - 3602
  • [3] Joint Training of Expanded End-to-end DNN for Text-dependent Speaker Verification
    Heo, Hee-soo
    Jung, Jee-weon
    Yang, Il-ho
    Yoon, Sung-hyun
    Yu, Ha-jin
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1532 - 1536
  • [4] aDCF Loss Function for Deep Metric Learning in End-to-End Text-Dependent Speaker Verification Systems
    Mingote, Victoria
    Miguel, Antonio
    Ribas, Dayana
    Ortega, Alfonso
    Lleida, Eduardo
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 772 - 784
  • [5] Optimization of False Acceptance/Rejection Rates and Decision Threshold for End-to-End Text-Dependent Speaker Verification Systems
    Mingote, Victoria
    Miguel, Antonio
    Ribas, Dayana
    Ortega, Alfonso
    Lleida, Eduardo
    [J]. INTERSPEECH 2019, 2019, : 2903 - 2907
  • [6] Strategies for End-to-End Text-Independent Speaker Verification
    Lin, Weiwei
    Mak, Man-Wai
    Chien, Jen-Tzung
    [J]. INTERSPEECH 2020, 2020, : 4308 - 4312
  • [7] Far-Field End-to-End Text-Dependent Speaker Verification based on Mixed Training Data with Transfer Learning and Enrollment Data Augmentation
    Qin, Xiaoyi
    Cai, Danwei
    Li, Ming
    [J]. INTERSPEECH 2019, 2019, : 4045 - 4049
  • [8] Text-dependent speaker verification system
    Qin, Bing
    Chen, Huipeng
    Li, Guangqi
    Liu, Songbo
    [J]. Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology, 2000, 32 (04): : 16 - 18
  • [9] End-to-End Feature Learning for Text-Independent Speaker Verification
    Chen, Fangzhou
    Bian, Tengyue
    Xu, Li
    [J]. PROCEEDINGS OF THE 2019 31ST CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2019), 2019, : 3949 - 3954
  • [10] End Point Detection Using Speech-Specific Knowledge for Text-Dependent Speaker Verification
    Bhukya, Ramesh K.
    Sarma, Biswajit Dev
    Prasanna, S. R. Mahadeva
    [J]. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2018, 37 (12) : 5507 - 5539