ON NOISE ESTIMATION FOR ROBUST SPEECH RECOGNITION USING VECTOR TAYLOR SERIES

Cited by: 8

Authors:

Zhao, Yong [1]

Juang, Biing-Hwang [1]

Affiliation:

[1] Georgia Institute of Technology, Center for Signal and Image Processing, Atlanta, GA 30332, USA

Source:

2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2010

Keywords:

Robust speech recognition; vector Taylor series; noise estimation

DOI:

10.1109/ICASSP.2010.5495669

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

In this paper, we propose a novel noise variance estimation method for VTS-based robust speech recognition that uses a fixed-point iteration. Noise parameters are re-estimated over a given utterance with an EM algorithm. The derivative of the auxiliary function with respect to the noise variance is derived, and the fixed-point algorithm estimates the noise variance by recursively approximating the root of that derivative. The method yields a re-estimation formula resembling standard ML variance estimation, and the iteration is step-size free. We also investigate improving noise estimation for efficient VTS adaptation. Several fast noise estimation methods are examined, including estimation from non-speech areas and incremental adaptation. In an evaluation on the Aurora 2 database, the proposed noise variance estimation method obtains a significant improvement in recognition accuracy over the method using the sample variance. Further experiments show that VTS ML estimation over non-speech areas is an effective fast adaptation method. The final refined approach achieves a WER of 8.75%, a 13% relative improvement over conventional VTS adaptation.
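To make the idea of a step-size-free fixed-point update concrete, here is a minimal scalar sketch. It is not the paper's derivation, which this record does not reproduce. It assumes the commonly used log-spectral mismatch model y = x + log(1 + exp(n - x)), linearised to first order, a fixed noise mean, and component posteriors already computed in the E-step. The function names and the multiplicative form of the update are illustrative assumptions; the update is chosen only so that its fixed points are roots of the derivative of the auxiliary function with respect to the noise variance.

```python
import math


def vts_compensate(mu_x, var_x, mu_n, var_n):
    """First-order VTS compensation of one scalar Gaussian (assumed model)."""
    g = 1.0 / (1.0 + math.exp(mu_n - mu_x))  # dy/dx at the expansion point
    f = 1.0 - g                              # dy/dn at the expansion point
    mu_y = mu_x + math.log1p(math.exp(mu_n - mu_x))
    var_y = g * g * var_x + f * f * var_n
    return mu_y, var_y, f


def fixed_point_noise_var(frames, posteriors, mu_x, var_x, mu_n, var_n, n_iter=50):
    """Hypothetical fixed-point re-estimation of the noise variance.

    frames:     noisy observations y_t (scalars)
    posteriors: posteriors[t][k] = gamma_t(k), held fixed (E-step output)
    mu_x/var_x: clean-speech mean and variance per mixture component
    """
    for _ in range(n_iter):
        num = 0.0
        den = 0.0
        for k in range(len(mu_x)):
            mu_y, var_y, f = vts_compensate(mu_x[k], var_x[k], mu_n, var_n)
            for y_t, gamma_t in zip(frames, posteriors):
                w = gamma_t[k] * f * f
                num += w * (y_t - mu_y) ** 2 / (var_y * var_y)
                den += w / var_y
        # No step size: at a fixed point num == den, i.e. the derivative is zero.
        var_n = var_n * num / den
    return var_n


# Toy data: one component, two frames placed symmetrically around the noisy mean.
frames = [math.log(2.0) + 1.0, math.log(2.0) - 1.0]
posteriors = [[1.0], [1.0]]
estimate = fixed_point_noise_var(frames, posteriors, [0.0], [1.0], 0.0, 1.0)
print(round(estimate, 6))  # about 3.0 for this toy data
```

In the limiting case where the clean-speech term vanishes (f = 1, g = 0), this update reduces in a single step to the weighted sample variance of the residuals, which is one way to read the abstract's remark that the re-estimation formula resembles standard ML variance estimation.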

Pages: 4290-4293

Number of pages: 4

50 records in total

[1] NOISE ADAPTIVE TRAINING USING A VECTOR TAYLOR SERIES APPROACH FOR NOISE ROBUST AUTOMATIC SPEECH RECOGNITION
Kalinli, Ozlem
Seltzer, Michael L.
Acero, Alex
[J]. 2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS, 2009, : 3825 - 3828
[2] Vector Taylor Series Expansion with Auditory Masking for Noise Robust Speech Recognition
Das, Biswajit
Panda, Ashish
[J]. 2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2016,
[3] Use of Generalised Nonlinearity in Vector Taylor Series Noise Compensation for Robust Speech Recognition
Loweimi, Erfan
Barker, Jon
Hain, Thomas
[J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3798 - 3802
[4] A new algorithm using improved Vector Taylor Series for robust speech recognition
Li, YY
Li, B
Wang, CY
Tang, CJ
[J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS, INTELLIGENT SYSTEMS AND SIGNAL PROCESSING, VOLS 1 AND 2, PROCEEDINGS, 2003, : 1146 - 1150
[5] A NOISE ROBUST I-VECTOR EXTRACTOR USING VECTOR TAYLOR SERIES FOR SPEAKER RECOGNITION
Lei, Yun
Burget, Lukas
Scheffer, Nicolas
[J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 6788 - 6791
[6] Robust Speech Recognition Using Improved Vector Taylor Series Algorithm for Embedded Systems
Lue, Yong
Wu, Haiyang
Wu, Zhenyang
[J]. IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2010, 56 (02) : 764 - 769
[7] SECOND ORDER VECTOR TAYLOR SERIES BASED ROBUST SPEECH RECOGNITION
Bu, Suliang
Qian, Yanmin
Sim, Khe Chai
You, Yongbin
Yu, Kai
[J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
[8] A NOISE-ROBUST SPEECH RECOGNITION METHOD COMPOSED OF WEAK NOISE SUPPRESSION AND WEAK VECTOR TAYLOR SERIES ADAPTATION
Komeiji, Shuji
Arakawa, Takayuki
Koshinaka, Takafumi
[J]. 2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012), 2012, : 103 - 106
[9] Taylor Series Expansion of Psychoacoustic Corruption Function for Noise Robust Speech Recognition
Das, Biswajit
Panda, Ashish
[J]. PROCEEDINGS OF 2016 IEEE 13TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP 2016), 2016, : 568 - 572
[10] NOISE ADAPTIVE FRONT-END NORMALIZATION BASED ON VECTOR TAYLOR SERIES FOR DEEP NEURAL NETWORKS IN ROBUST SPEECH RECOGNITION
Bo Li
Chai, Khe Sim
[J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7408 - 7412