ON NOISE ESTIMATION FOR ROBUST SPEECH RECOGNITION USING VECTOR TAYLOR SERIES

Cited by: 8
Authors
Zhao, Yong [1 ]
Juang, Biing-Hwang [1 ]
Affiliations
[1] Georgia Inst Technol, Ctr Signal & Image Proc, Atlanta, GA 30332 USA
Keywords
Robust speech recognition; vector Taylor series; noise estimation
DOI
10.1109/ICASSP.2010.5495669
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In this paper, we propose a novel noise variance estimation method that uses the fixed-point method for VTS-based robust speech recognition. Noise parameters are re-estimated over a given utterance using an EM algorithm. The derivative of the auxiliary function with respect to the noise variance is derived, and the fixed-point algorithm estimates the noise variance by recursively approximating the root of that derivative. The method leads to a re-estimation formula resembling the standard ML variance estimate, and the iteration procedure is step-size free. We also investigate improving noise estimation for efficient VTS adaptation. Several fast noise estimation methods are examined, including estimation from non-speech areas and incremental adaptation. In an evaluation on the Aurora 2 database, the proposed noise variance estimation method obtains a significant improvement in recognition accuracy over the method using the sample variance. Further experiments show that VTS ML estimation over non-speech areas is an effective fast adaptation method. The final refined approach achieves 8.75% WER, a 13% relative improvement over conventional VTS adaptation.
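As an illustration of what a step-size-free fixed-point variance update can look like, the sketch below uses a deliberately simplified scalar model rather than the paper's VTS formulation, which the abstract does not spell out. It assumes each observation is y_t = x_t + n_t, with x_t ~ N(0, s_t) of known variance s_t and n_t ~ N(0, v) of unknown variance v. Setting the derivative of the log-likelihood to zero gives v = sum_t w_t (y_t^2 - s_t) / sum_t w_t with w_t = 1 / (s_t + v)^2, which is iterated directly. The names `fixed_point_noise_variance` and `make_toy_data` are hypothetical.

```python
import random


def fixed_point_noise_variance(y, speech_var, v0=1.0, iters=50, floor=1e-8):
    """Fixed-point ML estimate of the noise variance v in a toy model.

    Assumed model (NOT the paper's VTS formula): y_t = x_t + n_t with
    x_t ~ N(0, speech_var[t]) known and n_t ~ N(0, v) unknown.
    """
    v = v0
    for _ in range(iters):
        # Weights depend on the current estimate; no step size is involved.
        w = [1.0 / (s + v) ** 2 for s in speech_var]
        num = sum(wi * (yi * yi - s) for wi, yi, s in zip(w, y, speech_var))
        # Weighted average of (y_t^2 - s_t), floored to stay positive.
        v = max(num / sum(w), floor)
    return v


def make_toy_data(n=20000, true_v=0.5, seed=0):
    """Draw synthetic observations from the assumed toy model."""
    rng = random.Random(seed)
    # Frames with s_t = 0 play the role of non-speech frames.
    speech_var = [rng.choice([0.0, 0.2, 2.0]) for _ in range(n)]
    y = [rng.gauss(0.0, (s + true_v) ** 0.5) for s in speech_var]
    return y, speech_var


if __name__ == "__main__":
    y, speech_var = make_toy_data()
    print(fixed_point_noise_variance(y, speech_var))
```

The update has the form of a weighted sample variance, which loosely mirrors the abstract's remark that the re-estimation formula resembles the standard ML variance estimate. Frames where the assumed speech variance is small receive the largest weights.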
Pages: 4290-4293
Number of pages: 4
Related papers
50 in total
  • [1] NOISE ADAPTIVE TRAINING USING A VECTOR TAYLOR SERIES APPROACH FOR NOISE ROBUST AUTOMATIC SPEECH RECOGNITION
    Kalinli, Ozlem
    Seltzer, Michael L.
    Acero, Alex
    [J]. 2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 3825 - 3828
  • [2] Vector Taylor Series Expansion with Auditory Masking for Noise Robust Speech Recognition
    Das, Biswajit
    Panda, Ashish
    [J]. 2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2016,
  • [3] Use of Generalised Nonlinearity in Vector Taylor Series Noise Compensation for Robust Speech Recognition
    Loweimi, Erfan
    Barker, Jon
    Hain, Thomas
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3798 - 3802
  • [4] A new algorithm using improved Vector Taylor Series for robust speech recognition
    Li, YY
    Li, B
    Wang, CY
    Tang, CJ
    [J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS, INTELLIGENT SYSTEMS AND SIGNAL PROCESSING, VOLS 1 AND 2, PROCEEDINGS, 2003, : 1146 - 1150
  • [5] A NOISE ROBUST I-VECTOR EXTRACTOR USING VECTOR TAYLOR SERIES FOR SPEAKER RECOGNITION
    Lei, Yun
    Burget, Lukas
    Scheffer, Nicolas
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 6788 - 6791
  • [6] Robust Speech Recognition Using Improved Vector Taylor Series Algorithm for Embedded Systems
    Lue, Yong
    Wu, Haiyang
    Wu, Zhenyang
    [J]. IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2010, 56 (02) : 764 - 769
  • [7] SECOND ORDER VECTOR TAYLOR SERIES BASED ROBUST SPEECH RECOGNITION
    Bu, Suliang
    Qian, Yanmin
    Sim, Khe Chai
    You, Yongbin
    Yu, Kai
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [8] A NOISE-ROBUST SPEECH RECOGNITION METHOD COMPOSED OF WEAK NOISE SUPPRESSION AND WEAK VECTOR TAYLOR SERIES ADAPTATION
    Komeiji, Shuji
    Arakawa, Takayuki
    Koshinaka, Takafumi
    [J]. 2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012), 2012, : 103 - 106
  • [9] Taylor Series Expansion of Psychoacoustic Corruption Function for Noise Robust Speech Recognition
    Das, Biswajit
    Panda, Ashish
    [J]. PROCEEDINGS OF 2016 IEEE 13TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP 2016), 2016, : 568 - 572
  • [10] NOISE ADAPTIVE FRONT-END NORMALIZATION BASED ON VECTOR TAYLOR SERIES FOR DEEP NEURAL NETWORKS IN ROBUST SPEECH RECOGNITION
    Li, Bo
    Sim, Khe Chai
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7408 - 7412