Robust Methods for Text-Dependent Speaker Verification

Cited by: 1
Authors
Bhukya, Ramesh K. [1]
Prasanna, S. R. Mahadeva [1, 2]
Sarma, Biswajit Dev [3]
Affiliations
[1] Indian Inst Technol Guwahati, Dept Elect & Elect Engn, Electro Med & Speech Technol Lab, Gauhati 781039, India
[2] Indian Inst Technol Dharwad, Dept Elect Engn, Dharwad 580011, Karnataka, India
[3] Bay Area Adv Analyt India P Ltd, Gauhati 781039, India
Keywords
End point detection; VLRs; Dominant resonant frequency; Glottal activity detection; Foreground speech segmentation; MEMD; IMFs; Hilbert spectrum; MFCCs; TDSV; DTW; EMPIRICAL MODE DECOMPOSITION; END-POINT DETECTION; SPEECH; RECOGNITION; VOWEL;
DOI
10.1007/s00034-019-01125-x
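Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract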
In this work, we explore noise-robust techniques at different stages of a Text-Dependent Speaker Verification (TDSV) system. At the signal level, noise compensation uses a robust end point detection technique based on speech-specific knowledge. At the feature level, compensation uses robust features extracted from the Hilbert Spectrum (HS) of the Intrinsic Mode Functions (IMFs) obtained by Modified Empirical Mode Decomposition (MEMD) of speech. We also explore a combined temporal and spectral speech enhancement technique, applied before end point detection, to enhance speech regions embedded in noise. All experiments are conducted on two databases: RSR2015 and the IITG database. Robust end point detection improves the performance of the TDSV system over energy-based end point detection in both clean and degraded speech conditions. Noise-robust HS features augmented with Mel-frequency cepstral coefficients (MFCCs) further improve performance. Applying speech enhancement before the signal-level and feature-level compensation yields a further improvement in the low-SNR cases. The final system, which combines the three robust methods, achieves a relative improvement in equal error rate (EER) of 6 to 25% on the RSR2015 database corrupted with babble noise of varying strength, and of around 30 to 45% on the IITG database.
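The abstract reports its gains as relative EER improvements, which are easy to misread as absolute percentage-point changes. The following is a minimal sketch of how such figures are commonly computed, not the authors' evaluation code. The function names, the threshold sweep, the toy scores, and the example EER values (8.0% and 6.0%) are all hypothetical and chosen only for illustration.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. The EER is taken
    at the threshold where false acceptance and false rejection rates are
    closest, as the mean of the two rates.
    """
    best_gap, eer = float("inf"), None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # false acceptance rate
        frr = sum(s < t for s in genuine) / len(genuine)     # false rejection rate
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


def relative_improvement(baseline_eer, new_eer):
    """Relative EER reduction in percent, measured against the baseline."""
    return 100.0 * (baseline_eer - new_eer) / baseline_eer


# Toy verification scores (hypothetical): higher means "more likely the claimed speaker".
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
toy_eer = equal_error_rate(genuine_scores, impostor_scores)

# Hypothetical EERs in percent: a drop from 8.0% to 6.0% is 2 points absolute
# but a 25% relative improvement.
toy_gain = relative_improvement(8.0, 6.0)
```

Under this definition, a relative improvement of 25% means the EER fell to three quarters of its baseline value, whatever that baseline was.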
Pages: 5253-5288
Number of pages: 36
Related papers
50 in total
  • [1] Robust Methods for Text-Dependent Speaker Verification
    Ramesh K. Bhukya
    S. R. Mahadeva Prasanna
    Biswajit Dev Sarma
    Circuits, Systems, and Signal Processing, 2019, 38 : 5253 - 5288
  • [2] Text-dependent speaker verification system
    Qin, Bing
    Chen, Huipeng
    Li, Guangqi
    Liu, Songbo
    Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology, 2000, 32 (04): : 16 - 18
  • [3] Text-Dependent Speaker Verification System: A Review
    Debnath, Saswati
    Soni, B.
    Baruah, U.
    Sah, D. K.
    PROCEEDINGS OF 2015 IEEE 9TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND CONTROL (ISCO), 2015,
  • [4] Deep feature for text-dependent speaker verification
    Liu, Yuan
    Qian, Yanmin
    Chen, Nanxin
    Fu, Tianfan
    Zhang, Ya
    Yu, Kai
    SPEECH COMMUNICATION, 2015, 73 : 1 - 13
  • [5] Bidirectional Attention for Text-Dependent Speaker Verification
    Fang, Xin
    Gao, Tian
    Zou, Liang
    Ling, Zhenhua
    SENSORS, 2020, 20 (23) : 1 - 17
  • [6] Content Normalization for Text-dependent Speaker Verification
    Dey, Subhadeep
    Madikeri, Srikanth
    Motlicek, Petr
    Ferras, Marc
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1482 - 1486
  • [7] IMPOSTURE CLASSIFICATION FOR TEXT-DEPENDENT SPEAKER VERIFICATION
    Larcher, Anthony
    Lee, Kong Aik
    Ma, Bin
    Li, Haizhou
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [8] COMPARISON OF MULTIPLE FEATURES AND MODELING METHODS FOR TEXT-DEPENDENT SPEAKER VERIFICATION
    Liu, Yi
    He, Liang
    Tian, Yao
    Chen, Zhuzi
    Liu, Jia
    Johnson, Michael T.
    2017 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2017, : 629 - 636
  • [9] Parallel Speaker and Content Modelling for Text-dependent Speaker Verification
    Ma, Jianbo
    Irtza, Saad
    Sriskandaraja, Kaavya
    Sethu, Vidhyasaharan
    Ambikairajah, Eliathamby
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 435 - 439
  • [10] A Survey on Text-Dependent and Text-Independent Speaker Verification
    Tu, Youzhi
    Lin, Weiwei
    Mak, Man-Wai
    IEEE ACCESS, 2022, 10 : 99038 - 99049