Deep feature for text-dependent speaker verification

Cited by: 140
Authors
Liu, Yuan [1]
Qian, Yanmin [1]
Chen, Nanxin [1]
Fu, Tianfan [1]
Zhang, Ya [2]
Yu, Kai [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Key Lab Shanghai Educ Commiss Intelligent Interac, Shanghai 200030, Peoples R China
[2] Shanghai Jiao Tong Univ, Dept Elect Engn, Shanghai 200030, Peoples R China
Keywords
Text-dependent speaker verification; Deep neural networks; Deep features; RSR2015; HIDDEN MARKOV-MODELS; NEURAL-NETWORKS; MACHINES;
DOI
10.1016/j.specom.2015.07.003
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Recently, deep learning has been successfully used in speech recognition; however, it has not been carefully explored or widely accepted for speaker verification. To incorporate deep learning into speaker verification, this paper proposes novel approaches for extracting and using features from deep learning models for text-dependent speaker verification. In contrast to traditional short-term spectral features, such as MFCC or PLP, the outputs from the hidden layers of various deep models are employed as deep features. Four types of deep models are investigated: deep Restricted Boltzmann Machines, speech-discriminant Deep Neural Network (DNN), speaker-discriminant DNN, and multi-task joint-learned DNN. Once deep features are extracted, they may be used within either the GMM-UBM framework or the identity vector (i-vector) framework. Joint linear discriminant analysis and probabilistic linear discriminant analysis are proposed as effective back-end classifiers for identity vector based deep features. These approaches were evaluated on the RSR2015 data corpus. Experiments showed that deep feature based methods obtain significant performance improvements over the traditional baselines, whether they are applied directly in the GMM-UBM system or used as identity vectors. The equal error rate (EER) of the best system using the proposed identity vector is 0.10%, only one fifteenth of that of the GMM-UBM baseline. (C) 2015 Elsevier B.V. All rights reserved.
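The abstract reports results as an equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how an EER can be approximated from verification scores. It is not the authors' evaluation code; the function name `equal_error_rate`, the threshold sweep, and the toy scores are assumptions made for illustration only.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping thresholds over all observed scores.

    Hypothetical helper for illustration; higher scores are assumed to mean
    "more likely the claimed speaker".
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: a genuine (target) trial scores below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: an impostor trial scores at or above the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (made up, not from the paper): one target trial falls below
# one impostor trial, so one of four trials is misclassified on each side.
targets = [0.9, 0.8, 0.7, 0.4]
impostors = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(targets, impostors))  # 0.25
```

With real systems the score lists are far larger, and the EER is usually interpolated between neighbouring thresholds rather than taken at the closest observed one.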
Pages: 1-13
Number of pages: 13