Robust Speaker Recognition Using Spectro-Temporal Autoregressive Models

Cited by: 0

Authors:

Mallidi, Sri Harish [1]

Ganapathy, Sriram [2]

Hermansky, Hynek [1]

Affiliations:

[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA

[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY USA

Source:

14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5 | 2013

Keywords:

Rate-Scale Filtering; Autoregressive Modeling; Speaker Recognition; Robust Feature Extraction;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Speaker recognition in noisy environments is challenging when there is a mismatch between the data used for enrollment and verification. In this paper, we propose a robust feature extraction scheme based on spectro-temporal modulation filtering using two-dimensional (2-D) autoregressive (AR) models. The first step is AR modeling of the sub-band temporal envelopes by applying linear prediction to the sub-band discrete cosine transform (DCT) components. These sub-band envelopes are stacked together and used for a second AR modeling step. The spectral envelope across the sub-bands is approximated by this AR model, and cepstral features are derived from it for speaker recognition. The use of AR models emphasizes the high-energy regions, which are relatively well preserved in the presence of noise. The degree of modulation filtering is controlled by the AR model order. Experiments are performed using noisy versions of the NIST 2010 speaker recognition evaluation (SRE) data with a state-of-the-art speaker recognition system. In these experiments, the proposed features provide significant improvements over baseline features (relative improvements of 20% in equal error rate (EER) and 35% in miss rate at 10% false alarm).
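The two AR modeling steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the uniform rectangular sub-bands on the DCT axis, the model orders, and the absence of any short-term integration of the envelopes are all assumptions made for illustration. The first step fits an AR model to each band of DCT coefficients, so that the AR power response approximates the temporal envelope of that band. The second step treats each time slice across sub-bands as a power spectrum, fits a low-order AR model to it, and converts that model to cepstral coefficients.

```python
import numpy as np
from scipy.fft import dct
from scipy.linalg import solve_toeplitz


def lpc_from_autocorr(r, order):
    """Solve the Yule-Walker equations; return a (with a[0] = 1) and the gain."""
    r = np.array(r, dtype=float)
    r[0] *= 1.0 + 1e-9  # tiny diagonal loading for numerical stability
    tail = solve_toeplitz(r[:order], -r[1:order + 1])
    gain = r[0] + np.dot(tail, r[1:order + 1])  # prediction error power
    return np.concatenate(([1.0], tail)), gain


def ar_to_cepstrum(a, gain, n_ceps):
    """Cepstrum of gain / A(z), with A(z) = 1 + sum_k a[k] z^-k."""
    p = len(a) - 1
    c = np.zeros(n_ceps)
    c[0] = np.log(gain)
    for n in range(1, n_ceps):
        acc = -a[n] if n <= p else 0.0
        for k in range(1, n):
            if n - k <= p:
                acc -= (k / n) * c[k] * a[n - k]
        c[n] = acc
    return c


def fdlp_envelopes(x, n_bands=8, order=20, n_env=100):
    """Step 1 (assumed form): AR models of sub-band temporal envelopes.

    Linear prediction is applied to each band of DCT coefficients; the
    AR power response, read along its frequency axis, is the envelope
    over time. Returns an array of shape (n_bands, n_env).
    """
    c = dct(np.asarray(x, dtype=float), type=2, norm="ortho")
    edges = np.linspace(0, len(c), n_bands + 1).astype(int)
    envs = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = c[lo:hi]
        full = np.correlate(band, band, mode="full")
        r = full[len(band) - 1:len(band) + order]  # lags 0..order
        a, gain = lpc_from_autocorr(r, order)
        A = np.fft.rfft(a, 2 * n_env)[:n_env]
        envs.append(gain / np.abs(A) ** 2)
    return np.array(envs)


def spectro_temporal_ar_features(x, n_bands=8, temporal_order=20,
                                 spectral_order=5, n_env=100, n_ceps=6):
    """Step 2 (assumed form): AR model across sub-bands, then cepstra.

    Returns an array of shape (n_env, n_ceps).
    """
    envs = fdlp_envelopes(x, n_bands, temporal_order, n_env)
    feats = []
    for t in range(n_env):
        # Autocorrelation = inverse transform of the power spectrum
        # sampled across sub-bands at this time index.
        r = np.fft.irfft(envs[:, t])[:spectral_order + 1]
        a, gain = lpc_from_autocorr(r, spectral_order)
        feats.append(ar_to_cepstrum(a, gain, n_ceps))
    return np.array(feats)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    signal = rng.standard_normal(8000)  # stand-in for one second at 8 kHz
    features = spectro_temporal_ar_features(signal)
    print(features.shape)
```

In this sketch the two order parameters (`temporal_order` and `spectral_order`, both hypothetical names) play the role the abstract assigns to the AR model order: lower orders give smoother envelopes, which corresponds to stronger modulation filtering along the respective axis.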


Pages: 3656-3660

Page count: 5

