A comparative study of noise estimation algorithms for nonlinear compensation in robust speech recognition

Cited by: 0
Authors
Zhao, Yong [1]
Juang, Biing-Hwang [2]
Affiliations
[1] Microsoft Corp, One Microsoft Way, Redmond, WA 98052 USA
[2] Georgia Inst Technol, Dept Elect & Comp Engn, Atlanta, GA 30332 USA
Keywords
Factor analysis; Gauss-Newton method; Nonlinear compensation; Parallel model combination; Robust speech recognition; Vector Taylor series; MAXIMUM-LIKELIHOOD; ENVIRONMENTS; SIGNAL
DOI
10.1016/j.specom.2017.02.005
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Nonlinear compensation models use a nonlinear mismatch function, which characterizes the joint effects of additive and convolutional noise, to realize noise-robust speech recognition. Representative compensation models include vector Taylor series (VTS), data-driven parallel model combination (DPMC), and the unscented transform (UT). The noise parameters of the compensation models, often estimated in the maximum likelihood (ML) sense, are known to play an important role in system performance in noisy conditions. In this paper, we conduct a systematic comparison between two popular approaches for estimating the noise parameters. The first approach employs the Gauss-Newton method in a generalized EM framework to iteratively maximize the EM auxiliary function. The second approach views the compensation models from a generative perspective, giving rise to an EM algorithm analogous to the ML estimation for factor analysis (EM-FA). We demonstrate a close connection between these two approaches: both belong to the family of gradient-based methods but differ in their convergence rates. Note that the convergence property can be crucial to noise estimation, since model compensation may be carried out frequently in changing noisy environments to retain the desired performance. Furthermore, we present an in-depth discussion of the advantages and limitations of the two approaches, and illustrate how to extend them to allow for adaptive training. The investigated noise estimation approaches are evaluated on several tasks: first, fitting a GMM to artificially corrupted samples, and then speech recognition on the Aurora 2 and Aurora 4 tasks. (C) 2017 Published by Elsevier B.V.
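The abstract does not give the mismatch function itself. As a rough illustration only, the sketch below assumes the form commonly used in the log-spectral domain, y = x + h + log(1 + exp(n - x - h)), where x is clean speech, n is additive noise, and h is the convolutional (channel) term. It compares a first-order VTS approximation of the corrupted mean and variance with a sampling-based estimate in the spirit of DPMC. The function names, the diagonal-covariance simplification, and the omission of the cepstral transform are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def mismatch(x, n, h):
    """Assumed log-spectral mismatch: y = x + h + log(1 + exp(n - x - h))."""
    # logaddexp(0, z) computes log(1 + exp(z)) in a numerically stable way.
    return x + h + np.logaddexp(0.0, n - x - h)


def vts_compensate(mu_x, var_x, mu_n, var_n, h):
    """First-order VTS: linearize the mismatch function at the means.

    Variances are treated as diagonal (per-dimension) for simplicity.
    """
    mu_y = mismatch(mu_x, mu_n, h)
    # Partial derivative dy/dx at the expansion point; dy/dn equals 1 - g.
    g = 1.0 / (1.0 + np.exp(mu_n - mu_x - h))
    var_y = g**2 * var_x + (1.0 - g) ** 2 * var_n
    return mu_y, var_y


def dpmc_compensate(mu_x, var_x, mu_n, var_n, h, num_samples=200_000, seed=0):
    """Sampling-based estimate: draw speech and noise, pass them through
    the mismatch function, and measure the resulting mean and variance."""
    rng = np.random.default_rng(seed)
    size = (num_samples,) + np.shape(mu_x)
    x = rng.normal(mu_x, np.sqrt(var_x), size=size)
    n = rng.normal(mu_n, np.sqrt(var_n), size=size)
    y = mismatch(x, n, h)
    return y.mean(axis=0), y.var(axis=0)


if __name__ == "__main__":
    # Hypothetical two-dimensional example; the values are illustrative only.
    mu_x, var_x = np.array([2.0, 0.0]), np.array([0.5, 0.5])
    mu_n, var_n = np.array([0.0, 1.0]), np.array([0.2, 0.2])
    h = 0.1
    print("VTS :", vts_compensate(mu_x, var_x, mu_n, var_n, h))
    print("DPMC:", dpmc_compensate(mu_x, var_x, mu_n, var_n, h))
```

In this sketch the noise parameters (mu_n, var_n, h) are simply given. The estimation approaches compared in the paper address how to obtain such parameters from noisy data, which the example does not attempt.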
Pages: 58-69
Page count: 12