Robust Speaker Recognition Using Denoised Vocal Source and Vocal Tract Features

Cited by: 34

Authors:

Wang, Ning ^{[1]}

Ching, P. C. ^{[1]}

Zheng, Nengheng ^{[2]}

Lee, Tan ^{[1]}

Affiliations:

[1] Chinese Univ Hong Kong, Dept Elect Engn, Shatin, Hong Kong, Peoples R China

[2] Shenzhen Univ, Coll Informat Engn, Shenzhen 518060, Peoples R China

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2011, Vol. 19, No. 1

Funding:

National Natural Science Foundation of China;

Keywords:

Robust parameter estimation; source-tract features; speaker recognition; spectral subtraction; REPRESENTATIONS; NOISE;

DOI:

10.1109/TASL.2010.2045800

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

To alleviate the problem of severe degradation of speaker recognition performance under noisy environments because of inadequate and inaccurate speaker-discriminative information, a method of robust feature estimation that can capture both vocal source- and vocal tract-related characteristics from noisy speech utterances is proposed. Spectral subtraction, a simple yet useful speech enhancement technique, is employed to remove the noise-specific components prior to the feature extraction process. It has been shown through analytical derivation, as well as by simulation results, that the proposed feature estimation method leads to robust recognition performance, especially at low signal-to-noise ratios. In the context of Gaussian mixture model-based speaker recognition in the presence of additive white Gaussian noise, the new approach produces consistent reduction of both identification error rate and equal error rate at signal-to-noise ratios ranging from 0 to 15 dB.
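The spectral subtraction step mentioned in the abstract can be illustrated with a minimal sketch of textbook power spectral subtraction. This is not the paper's own estimator, whose details are not given in this record. The function name `spectral_subtraction`, the frame length, hop size, spectral floor, and the assumption that the first few frames contain noise only are all illustrative choices.

```python
import numpy as np

def spectral_subtraction(noisy, frame_len=256, hop=128, noise_frames=5, floor=0.01):
    """Basic power spectral subtraction (illustrative sketch, not the paper's method).

    Assumes the first `noise_frames` frames contain noise only and that
    hop == frame_len // 2, so the periodic Hann windows overlap-add to 1.
    """
    noisy = np.asarray(noisy, dtype=float)
    n = np.arange(frame_len)
    # Periodic Hann window: with 50% overlap the shifted windows sum to one.
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_len)

    n_frames = 1 + (len(noisy) - frame_len) // hop
    frames = np.stack([noisy[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])

    spec = np.fft.rfft(frames, axis=1)
    power = np.abs(spec) ** 2
    phase = np.angle(spec)

    # Noise power spectrum estimated from the leading (assumed speech-free) frames.
    noise_power = power[:noise_frames].mean(axis=0)

    # Subtract the noise estimate and clamp to a small spectral floor
    # so that the power never becomes negative.
    clean_power = np.maximum(power - noise_power, floor * noise_power)

    # Recombine the enhanced magnitude with the noisy phase and resynthesize.
    clean_spec = np.sqrt(clean_power) * np.exp(1j * phase)
    clean_frames = np.fft.irfft(clean_spec, n=frame_len, axis=1)

    out = np.zeros(len(noisy))
    for i in range(n_frames):
        out[i * hop:i * hop + frame_len] += clean_frames[i]
    return out
```

In a pipeline of the kind the abstract describes, the enhanced signal would then be passed to feature extraction. Samples beyond the last full frame are left at zero in this sketch.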


Pages: 196-205

Number of pages: 10

Related records: 50 in total

[41] SOURCE-SYSTEM INTERACTION IN VOCAL TRACT
FLANAGAN, JL
ANNALS OF THE NEW YORK ACADEMY OF SCIENCES, 1968, 155 (A1) : 9 - &
[42] 'Mixing' the registers: Glottal source or vocal tract?
Miller, DG
Schutte, HK
FOLIA PHONIATRICA ET LOGOPAEDICA, 2005, 57 (5-6) : 278 - 291
[43] Fuzzy Phoneme Classification Using Multi-speaker Vocal Tract Length Normalization
Lung, Jensen Wong Jing
Salam, Md Sah Hj
Rehman, Amjad
Rahim, Mohd Shafry Mohd
Saba, Tanzila
IETE TECHNICAL REVIEW, 2014, 31 (02) : 128 - 136
[44] Vocal Style Factorization for Effective Speaker Recognition in Affective Scenarios
Sandler, Morgan
Ross, Arun
2023 IEEE INTERNATIONAL JOINT CONFERENCE ON BIOMETRICS, IJCB, 2023,
[45] DEEPTALK: VOCAL STYLE ENCODING FOR SPEAKER RECOGNITION AND SPEECH SYNTHESIS
Chowdhury, Anurag
Ross, Arun
David, Prabu
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6189 - 6193
[46] An Approach to Vocal Tract Length Normalization by Robust Formant
Kabir, A.
Barker, J.
Giurgiu, M.
RECENT ADVANCES IN CIRCUITS, SYSTEMS AND SIGNALS, 2010, : 345 - +
[47] Automated Vocal Emotion Recognition Using Phoneme Class Specific Features
Kiss, Geza
van Santen, Jan
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 1161 - 1164
[48] ON THE IMPORTANCE OF VOCAL TRACT CONSTRICTION FOR SPEAKER CHARACTERIZATION: THE WHISPERED SPEECH STUDY
Das, Rohan Kumar
Li, Haizhou
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7119 - 7123
[49] Regional Resonance of the Lower Vocal Tract and its Contribution to Speaker Characteristics
Zhang, Lin
Honda, Kiyoshi
Wei, Jianguo
Adachi, Seiji
INTERSPEECH 2020, 2020, : 1391 - 1395
[50] BAYESIAN VOCAL TRACT MODEL ESTIMATES OF NASAL STOPS FOR SPEAKER VERIFICATION
Enzinger, Ewald
Kasess, Christian H.
2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
