DEEP NEURAL NETWORK BASED POSTERIORS FOR TEXT-DEPENDENT SPEAKER VERIFICATION

被引:0
|
作者
Dey, Subhadeep [1 ,2 ]
Madikeri, Srikanth [1 ]
Ferras, Marc [1 ]
Modicek, Petr [1 ]
机构
[1] Idiap Res Inst, Martigny, Switzerland
[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
关键词
Text-dependent speaker verification; DNN posterior; Dynamic Time Warping;
D O I
暂无
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
The i-vector and Joint Factor Analysis (JFA) systems for text-dependent speaker verification use sufficient statistics computed from a speech utterance to estimate speaker models. These statistics average the acoustic information over the utterance thereby losing all the sequence information. In this paper, we study explicit content matching using Dynamic Time Warping (DTW) and present the best achievable error rates for speaker-dependent and speaker-independent content matching. For this purpose, a Deep Neural Network/Hidden Markov Model Automatic Speech Recognition (DNN/HMM ASR) system is used to extract content-related posterior probabilities. This approach outperforms systems using Gaussian mixture model posteriors by at least 50% Equal Error Rate (EER) on the RSR2015 in content mismatch trials. DNN posteriors are also used in i-vector and JFA systems, obtaining EERs as low as 0.02%.
引用
收藏
页码:5050 / 5054
页数:5
相关论文
共 50 条
  • [1] Deep feature for text-dependent speaker verification
    Liu, Yuan
    Qian, Yanmin
    Chen, Nanxin
    Fu, Tianfan
    Zhang, Ya
    Yu, Kai
    [J]. SPEECH COMMUNICATION, 2015, 73 : 1 - 13
  • [2] DEEP NEURAL NETWORKS FOR SMALL FOOTPRINT TEXT-DEPENDENT SPEAKER VERIFICATION
    Variani, Ehsan
    Lei, Xin
    McDermott, Erik
    Moreno, Ignacio Lopez
    Gonzalez-Dominguez, Javier
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [3] Covariance Based Deep Feature for Text-Dependent Speaker Verification
    Wang, Shuai
    Dinkel, Heinrich
    Qian, Yanmin
    Yu, Kai
    [J]. INTELLIGENCE SCIENCE AND BIG DATA ENGINEERING, 2018, 11266 : 231 - 242
  • [4] Deep Embedding Learning for Text-Dependent Speaker Verification
    Zhang, Peng
    Hu, Peng
    Zhang, Xueliang
    [J]. INTERSPEECH 2020, 2020, : 3461 - 3465
  • [5] Tandem Deep Features for Text-Dependent Speaker Verification
    Fu, Tianfan
    Qian, Yanmin
    Liu, Yuan
    Yu, Kai
    [J]. 15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 1327 - 1331
  • [6] Text-dependent speaker verification system
    Qin, Bing
    Chen, Huipeng
    Li, Guangqi
    Liu, Songbo
    [J]. Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology, 2000, 32 (04): : 16 - 18
  • [7] Text-dependent speaker verification using genetic algorithm and competitive learning neural network
    Cho, Seongwon
    Kim, Jaemin
    Kim, Daehwan
    Kang, Jiwoon
    Lee, Jinhyung
    Kim, Hunki
    Kim, Seokho
    Oh, Dusik
    Jeon, Seoungseon
    Chung, Sun-Tae
    [J]. PROCEEDINGS OF THE FOURTH IASTED INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, PATTERN RECOGNITION, AND APPLICATIONS, 2007, : 293 - +
  • [8] Text-Dependent Speaker Verification System: A Review
    Debnath, Saswati
    Soni, B.
    Baruah, U.
    Sah, D. K.
    [J]. PROCEEDINGS OF 2015 IEEE 9TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND CONTROL (ISCO), 2015,
  • [9] ATTENTION-BASED MODELS FOR TEXT-DEPENDENT SPEAKER VERIFICATION
    Chowdhury, F. A. Rezaur Rahman
    Wang, Quan
    Moreno, Ignacio Lopez
    Wan, Li
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5359 - 5363
  • [10] Bidirectional Attention for Text-Dependent Speaker Verification
    Fang, Xin
    Gao, Tian
    Zou, Liang
    Ling, Zhenhua
    [J]. SENSORS, 2020, 20 (23) : 1 - 17