New Features Using Robust MVDR Spectrum of Filtered Autocorrelation Sequence for Robust Speech Recognition

Cited by: 5

Authors:

Seyedin, Sanaz ^{[1]}

Ahadi, Seyed Mohammad ^{[2]}

Gazor, Saeed ^{[1]}

Affiliations:

[1] Queens Univ, Dept Elect & Comp Engn, Kingston, ON K7L 3N6, Canada

[2] Amirkabir Univ Technol, Dept Elect Engn, Tehran 15914, Iran

Source:

SCIENTIFIC WORLD JOURNAL | 2013

Keywords:

FEATURE-EXTRACTION; FRONT-END; COEFFICIENTS;

DOI:

10.1155/2013/634160

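Chinese Library Classification (CLC) number:

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];

Discipline classification code: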

07 ; 0710 ; 09 ;

Abstract:

This paper presents a novel noise-robust feature extraction method for speech recognition using the robust perceptual minimum variance distortionless response (MVDR) spectrum of the temporally filtered autocorrelation sequence. The perceptual MVDR spectrum of the filtered short-time autocorrelation sequence can reduce the effects of the residual nonstationary additive noise that remains after filtering the autocorrelation. To achieve a more robust front-end, we also modify the robust distortionless constraint of the MVDR spectral estimation method by reweighting the subband power spectrum values according to the subband signal-to-noise ratios (SNRs), which adapts the constraint to the proposed approach. The new weighting function passes the components of the input signal at the frequencies least affected by noise with larger weights and attenuates the noisy and undesired components more effectively. This modification reduces the noise residuals in the spectrum estimated from the filtered autocorrelation sequence, leading to a more robust algorithm. When evaluated on the Aurora 2 recognition task, the proposed method outperformed the Mel frequency cepstral coefficient (MFCC) baseline, relative autocorrelation sequence MFCC (RAS-MFCC), and MVDR-based features in several different noisy conditions.


Pages: 11

50 items in total

[41] Temporal Envelope Subtraction for Robust Speech Recognition Using Modulation Spectrum
Ganapathy, Sriram
Thomas, Samuel
Hermansky, Hynek
[J]. 2009 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION & UNDERSTANDING (ASRU 2009), 2009, : 164 - 169
[42] A robust speech recognition system using FRM running spectrum filtering
Hayasaka, N
Wada, N
Yoshizawa, S
Miyanaga, Y
[J]. ISCCSP : 2004 FIRST INTERNATIONAL SYMPOSIUM ON CONTROL, COMMUNICATIONS AND SIGNAL PROCESSING, 2004, : 401 - 404
[43] Robust speech recognition features based on temporal trajectory filtering of frequency band spectrum
Shen, JL
Hwang, WL
Lee, LS
[J]. ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 881 - 884
[44] Quality-Aware Bag of Modulation Spectrum Features for Robust Speech Emotion Recognition
Kshirsagar, Shruti Rajendra
Falk, Tiago Henrik
[J]. IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2022, 13 (04) : 1892 - 1905
[45] DEEP CONVOLUTIONAL NETS AND ROBUST FEATURES FOR REVERBERATION-ROBUST SPEECH RECOGNITION
Mitra, Vikramjit
Wang, Wen
Franco, Horacio
[J]. 2014 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY SLT 2014, 2014, : 548 - 553
[46] ON TIME-FREQUENCY MASK ESTIMATION FOR MVDR BEAMFORMING WITH APPLICATION IN ROBUST SPEECH RECOGNITION
Xiao, Xiong
Zhao, Shengkui
Jones, Douglas L.
Chng, Eng Siong
Li, Haizhou
[J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 3246 - 3250
[47] On properties of modulation spectrum for robust automatic speech recognition
Kanedera, N
Hermansky, H
Arai, T
[J]. PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998, : 613 - 616
[48] Spectrum enhancement with sparse coding for robust speech recognition
He, Yongjun
Sun, Guanglu
Han, Jiqing
[J]. DIGITAL SIGNAL PROCESSING, 2015, 43 : 59 - 70
[49] Modulation Spectrum Equalization for Improved Robust Speech Recognition
Sun, Liang-Che
Lee, Lin-Shan
[J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (03) : 828 - 843
[50] Modulation spectrum exponential weighting for robust speech recognition
Fan, Hao-teng
Lian, Yi-cheng
Hung, Jeih-weih
[J]. 2012 12TH INTERNATIONAL CONFERENCE ON ITS TELECOMMUNICATIONS (ITST-2012), 2012, : 812 - 816
