End-to-End Text-Dependent Speaker Verification

Cited by: 0
Authors
Heigold, Georg [1 ,2 ,4 ]
Moreno, Ignacio [3 ]
Bengio, Samy [3 ]
Shazeer, Noam [3 ]
Affiliations
[1] Univ Saarland, D-66123 Saarbrucken, Germany
[2] DFKI, Berlin, Germany
[3] Google Inc, Mountain View, CA USA
[4] Google, Mountain View, CA USA
Keywords
speaker verification; end-to-end training; deep learning
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper we present a data-driven, integrated approach to speaker verification, which maps a test utterance and a few reference utterances directly to a single score for verification and jointly optimizes the system's components using the same evaluation protocol and metric as at test time. Such an approach will result in simple and efficient systems, requiring little domain-specific knowledge and making few model assumptions. We implement the idea by formulating the problem as a single neural network architecture, including the estimation of a speaker model on only a few utterances, and evaluate it on our internal "Ok Google" benchmark for text-dependent speaker verification. The proposed approach appears to be very effective for big data applications like ours that require highly accurate, easy-to-maintain systems with a small footprint.
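The abstract describes mapping a test utterance and a few reference utterances directly to one verification score. The following is a minimal sketch of that idea, not the authors' implementation. It assumes each utterance has already been turned into a fixed-length embedding by some network, that the speaker model is the average of the enrollment embeddings, and that the score is a logistic function of cosine similarity. The function names and the parameters `w` and `b` are hypothetical and do not come from the abstract.

```python
import math


def average(vectors):
    """Element-wise mean of equal-length vectors (assumed speaker model)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verification_score(test_embedding, enrollment_embeddings, w=1.0, b=0.0):
    """Map one test embedding and a few enrollment embeddings to one score.

    w and b are hypothetical scale and offset parameters that would be
    learned jointly with the embedding network in an end-to-end setup.
    """
    speaker_model = average(enrollment_embeddings)
    similarity = cosine(test_embedding, speaker_model)
    return 1.0 / (1.0 + math.exp(-(w * similarity + b)))


def end_to_end_loss(score, is_target):
    """Binary cross-entropy on the accept/reject decision (assumed loss)."""
    return -math.log(score) if is_target else -math.log(1.0 - score)


if __name__ == "__main__":
    enrollment = [[2.0, 0.0], [4.0, 0.0]]
    print(verification_score([1.0, 0.0], enrollment))  # same direction
    print(verification_score([0.0, 1.0], enrollment))  # orthogonal
```

Because the loss is computed on the final accept/reject score, gradients would flow through the speaker-model estimate as well as the embeddings, which is one way to read the abstract's claim that the components are optimized jointly under the test-time protocol.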
Pages: 5115-5119
Page count: 5
Related papers
50 in total
  • [31] Data Augmentation Enhanced Speaker Enrollment for Text-dependent Speaker Verification
    Sarkar, Achintya Kumar
    Sarma, Himangshu
    Dwivedi, Priyanka
    Tan, Zheng-Hua
    [J]. 2020 3RD INTERNATIONAL CONFERENCE ON ENERGY, POWER AND ENVIRONMENT: TOWARDS CLEAN ENERGY TECHNOLOGIES (ICEPE 2020), 2021,
  • [32] Template-matching for text-dependent speaker verification
    Dey, Subhadeep
    Motlicek, Petr
    Madikeri, Srikanth
    Ferras, Marc
    [J]. SPEECH COMMUNICATION, 2017, 88 : 96 - 105
  • [33] On Residual CNN in Text-Dependent Speaker Verification Task
    Malykh, Egor
    Novoselov, Sergey
    Kudashev, Oleg
    [J]. SPEECH AND COMPUTER, SPECOM 2017, 2017, 10458 : 593 - 601
  • [34] MODELLING THE ALTERNATIVE HYPOTHESIS FOR TEXT-DEPENDENT SPEAKER VERIFICATION
    Larcher, Anthony
    Lee, Kong Aik
    Ma, Bin
    Li, Haizhou
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [35] Effective Phase Encoding for End-to-end Speaker Verification
    Peng, Junyi
    Qu, Xiaoyang
    Gu, Rongzhi
    Wang, Jianzong
    Xiao, Jing
    Burget, Lukas
    Cernocky, Jan "Honza"
    [J]. INTERSPEECH 2021, 2021, : 2366 - 2370
  • [36] Generalized End-to-End Loss for Forensic Speaker Verification
    Wang, Huapeng
    He, Fangzhou
    Wu, Lianquan
    [J]. Journal of Systems Science and Information, 2023, 11 (02) : 264 - 276
  • [37] Contrastive Learning for improving End-to-end Speaker Verification
    Tang, Yanxi
    Wang, Jianzong
    Qu, Xiaoyang
    Xiao, Jing
    [J]. 2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [38] Angular Softmax Loss for End-to-end Speaker Verification
    Li, Yutian
    Gao, Feng
    Ou, Zhijian
    Sun, Jiasong
    [J]. 2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2018, : 190 - 194
  • [39] Robust End-to-End Speaker Verification Using EEG
    Han, Yan
    Krishna, Gautam
    Tran, Co
    Carnahan, Mason
    Tewfik, Ahmed H.
    [J]. 28TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2020), 2021, : 1170 - 1174
  • [40] Constrained temporal structure for text-dependent speaker verification
    Larcher, Anthony
    Bonastre, Jean-Francois
    Mason, John S. D.
    [J]. DIGITAL SIGNAL PROCESSING, 2013, 23 (06) : 1910 - 1917