Robust Speaker Recognition Using Denoised Vocal Source and Vocal Tract Features

Cited by: 34
Authors
Wang, Ning [1]
Ching, P. C. [1]
Zheng, Nengheng [2]
Lee, Tan [1]
Affiliations
[1] Chinese Univ Hong Kong, Dept Elect Engn, Shatin, Hong Kong, Peoples R China
[2] Shenzhen Univ, Coll Informat Engn, Shenzhen 518060, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Robust parameter estimation; source-tract features; speaker recognition; spectral subtraction; REPRESENTATIONS; NOISE;
DOI
10.1109/TASL.2010.2045800
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
To alleviate the problem of severe degradation of speaker recognition performance under noisy environments because of inadequate and inaccurate speaker-discriminative information, a method of robust feature estimation that can capture both vocal source- and vocal tract-related characteristics from noisy speech utterances is proposed. Spectral subtraction, a simple yet useful speech enhancement technique, is employed to remove the noise-specific components prior to the feature extraction process. It has been shown through analytical derivation, as well as by simulation results, that the proposed feature estimation method leads to robust recognition performance, especially at low signal-to-noise ratios. In the context of Gaussian mixture model-based speaker recognition with the presence of additive white Gaussian noise, the new approach produces consistent reduction of both identification error rate and equal error rate at signal-to-noise ratios ranging from 0 to 15 dB.
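As a rough illustration of the spectral subtraction step mentioned in the abstract, the sketch below applies textbook power spectral subtraction to a synthetic tone corrupted by additive white Gaussian noise. It is not the authors' implementation: the frame length, the spectral floor, the noise estimate taken from noise-only frames, the 5 dB test condition, and all function names (`dft`, `add_awgn`, `estimate_noise_power`, `spectral_subtract`, `demo`) are assumptions made for this example only.

```python
import cmath
import math
import random


def dft(frame):
    """Naive O(n^2) discrete Fourier transform (adequate for short demo frames)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse DFT; keeps the real part because the signals here are real."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise whose variance gives an expected SNR of snr_db."""
    signal_power = sum(x * x for x in signal) / len(signal)
    sigma = math.sqrt(signal_power / 10 ** (snr_db / 10))
    return [x + rng.gauss(0.0, sigma) for x in signal], sigma


def estimate_noise_power(noise_frames):
    """Average the power spectrum over noise-only frames (assumed to be available)."""
    total = [0.0] * len(noise_frames[0])
    for frame in noise_frames:
        for k, value in enumerate(dft(frame)):
            total[k] += abs(value) ** 2
    return [p / len(noise_frames) for p in total]


def spectral_subtract(frame, noise_power, floor=0.01):
    """Subtract the noise power per bin, apply a spectral floor, keep the noisy phase."""
    cleaned = []
    for value, n_pow in zip(dft(frame), noise_power):
        power = abs(value) ** 2
        kept = max(power - n_pow, floor * power)  # the floor avoids negative power
        gain = math.sqrt(kept / power) if power > 0.0 else 0.0
        cleaned.append(gain * value)
    return idft(cleaned)


def demo(seed=0, frame_len=64, snr_db=5.0, n_frames=8):
    """Return the summed squared error before and after subtraction on a test tone."""
    rng = random.Random(seed)
    clean = [math.sin(2 * math.pi * 4 * t / frame_len) for t in range(frame_len)]
    _, sigma = add_awgn(clean, snr_db, rng)  # only used to obtain the noise level
    noise_frames = [[rng.gauss(0.0, sigma) for _ in range(frame_len)] for _ in range(20)]
    noise_power = estimate_noise_power(noise_frames)
    err_noisy = err_denoised = 0.0
    for _ in range(n_frames):
        noisy, _ = add_awgn(clean, snr_db, rng)
        denoised = spectral_subtract(noisy, noise_power)
        err_noisy += sum((a - b) ** 2 for a, b in zip(noisy, clean))
        err_denoised += sum((a - b) ** 2 for a, b in zip(denoised, clean))
    return err_noisy, err_denoised


if __name__ == "__main__":
    print(demo())
```

In a recognition pipeline of the kind the abstract describes, the enhanced frames would then be passed on to feature extraction. A practical system would typically use an FFT with overlapping windowed frames instead of the naive transform used here.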
Pages: 196-205
Number of pages: 10
Related papers
50 in total
  • [41] SOURCE-SYSTEM INTERACTION IN VOCAL TRACT
    FLANAGAN, JL
    ANNALS OF THE NEW YORK ACADEMY OF SCIENCES, 1968, 155 (A1) : 9 - &
  • [42] 'Mixing' the registers: Glottal source or vocal tract?
    Miller, DG
    Schutte, HK
    FOLIA PHONIATRICA ET LOGOPAEDICA, 2005, 57 (5-6) : 278 - 291
  • [43] Fuzzy Phoneme Classification Using Multi-speaker Vocal Tract Length Normalization
    Lung, Jensen Wong Jing
    Salam, Md Sah Hj
    Rehman, Amjad
    Rahim, Mohd Shafry Mohd
    Saba, Tanzila
    IETE TECHNICAL REVIEW, 2014, 31 (02) : 128 - 136
  • [44] Vocal Style Factorization for Effective Speaker Recognition in Affective Scenarios
    Sandler, Morgan
    Ross, Arun
    2023 IEEE INTERNATIONAL JOINT CONFERENCE ON BIOMETRICS, IJCB, 2023,
  • [45] DEEPTALK: VOCAL STYLE ENCODING FOR SPEAKER RECOGNITION AND SPEECH SYNTHESIS
    Chowdhury, Anurag
    Ross, Arun
    David, Prabu
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6189 - 6193
  • [46] An Approach to Vocal Tract Length Normalization by Robust Formant
    Kabir, A.
    Barker, J.
    Giurgiu, M.
    RECENT ADVANCES IN CIRCUITS, SYSTEMS AND SIGNALS, 2010, : 345 - +
  • [47] Automated Vocal Emotion Recognition Using Phoneme Class Specific Features
    Kiss, Geza
    van Santen, Jan
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 1161 - 1164
  • [48] ON THE IMPORTANCE OF VOCAL TRACT CONSTRICTION FOR SPEAKER CHARACTERIZATION: THE WHISPERED SPEECH STUDY
    Das, Rohan Kumar
    Li, Haizhou
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7119 - 7123
  • [49] Regional Resonance of the Lower Vocal Tract and its Contribution to Speaker Characteristics
    Zhang, Lin
    Honda, Kiyoshi
    Wei, Jianguo
    Adachi, Seiji
    INTERSPEECH 2020, 2020, : 1391 - 1395
  • [50] BAYESIAN VOCAL TRACT MODEL ESTIMATES OF NASAL STOPS FOR SPEAKER VERIFICATION
    Enzinger, Ewald
    Kasess, Christian H.
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,