A Variational Approach to Robust Maximum Likelihood Estimation for Speech Recognition

Cited by: 0

Authors:

Omar, Mohamed Kamal [1]

Affiliation:

[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA

Source:

INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4 | 2007

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In many automatic speech recognition (ASR) applications, the data used to estimate the class-conditional feature probability density function (PDF) is noisy, and the test data is mismatched with the training data. Previous research has shown that the effect of this problem may be reduced by using models that take the effect of the noise into consideration, and by transforming the features or the models used in the classifier to adapt to new environments and speakers. This paper addresses the degradation in the performance of ASR systems due to small, possibly time-varying perturbations of the training data. To approach this problem, we provide a computationally efficient algorithm for estimating the model parameters that maximize the sum of the log likelihood and the negative of a measure of the sensitivity of the estimated likelihood to these perturbations. This approach does not make any assumptions about the noise model during training. We present several large vocabulary speech recognition experiments that show a significant improvement in recognition accuracy over the baseline maximum likelihood (ML) models.
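The objective described in the abstract (log likelihood minus a sensitivity penalty) can be illustrated with a toy sketch. The abstract does not specify the sensitivity measure, so everything below is an assumption made for illustration only: a single 1-D Gaussian, a sensitivity term equal to the mean squared derivative of the log density with respect to the observation, and a hypothetical weight `lam`. The function names are likewise hypothetical. This is not the paper's algorithm.

```python
import math

# Toy sketch (assumed setup, not the paper's method): fit a 1-D Gaussian by
# maximizing  mean log p(x)  -  lam * mean (d/dx log p(x))^2.
# For a Gaussian, d/dx log p(x) = -(x - mu) / var, so the penalty is
# lam * s2 / var^2, where s2 is the sample variance around mu.

def robust_gaussian_fit(xs, lam):
    """Return (mu, var) maximizing the assumed penalized objective."""
    n = len(xs)
    mu = sum(xs) / n                           # the sample mean is optimal for both terms
    s2 = sum((x - mu) ** 2 for x in xs) / n    # ML (biased) sample variance
    # Setting the derivative with respect to var to zero gives
    # var^2 - s2 * var - 4 * lam * s2 = 0; take the positive root.
    var = 0.5 * (s2 + math.sqrt(s2 * s2 + 16.0 * lam * s2))
    return mu, var

def penalized_objective(xs, mu, var, lam):
    """Mean log likelihood minus lam times the assumed sensitivity measure."""
    n = len(xs)
    loglik = sum(-0.5 * math.log(2.0 * math.pi * var)
                 - (x - mu) ** 2 / (2.0 * var) for x in xs) / n
    sensitivity = sum(((x - mu) / var) ** 2 for x in xs) / n
    return loglik - lam * sensitivity
```

Under these assumptions, `lam = 0` recovers the ordinary ML variance, while any `lam > 0` yields a larger variance, that is, a flatter density whose likelihood changes less when the observations are slightly perturbed.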


Pages: 1049-1052

Page count: 4
