Maximum likelihood subband polynomial regression for robust speech recognition

Cited by: 3
Authors
Lu, Yong [1, 2]
Wu, Zhenyang [2]
Affiliations
[1] Hohai Univ, Coll Comp & Informat Engn, Nanjing 210098, Jiangsu, Peoples R China
[2] Southeast Univ, Sch Informat Sci & Engn, Nanjing 210096, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Model adaptation; Subband polynomial regression; Hidden Markov model; Robust speech recognition; NOISE; ADAPTATION; TRANSFORMATION; MIXTURE;
DOI
10.1016/j.apacoust.2012.11.016
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206 ; 082403 ;
Abstract
In this paper, we propose a model adaptation algorithm based on maximum likelihood subband polynomial regression (MLSPR) for robust speech recognition. In this algorithm, the cepstral mean vectors of prior trained hidden Markov models (HMMs) are converted to the log-spectral domain by the inverse discrete cosine transform (DCT) and each log-spectral mean vector is divided into several subband vectors. The relationship between the training and testing subband vectors is approximated by a polynomial function. The polynomial coefficients are estimated from adaptation data using the expectation-maximization (EM) algorithm under the maximum likelihood (ML) criterion. The experimental results show that the proposed MLSPR algorithm is superior to both the maximum likelihood linear regression (MLLR) adaptation and maximum likelihood subband weighting (MLSW) approach. In the MLSPR adaptation, only a very small amount of adaptation data is required and therefore it is more useful for fast model adaptation. (C) 2012 Elsevier Ltd. All rights reserved.
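The mean-transformation pipeline described in the abstract (inverse DCT to the log-spectral domain, subband split, per-subband polynomial mapping, DCT back to cepstra) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `dct_matrix` and `adapt_mean`, the orthonormal DCT-II convention, the equal-width subband split, and the hand-picked polynomial coefficients are all assumptions made for illustration. In the paper the coefficients are estimated with EM under the ML criterion, which is not reproduced here.

```python
import numpy as np

def dct_matrix(n_ceps, n_bands):
    """Orthonormal DCT-II matrix C with shape (n_ceps, n_bands).

    Assumed convention: cepstrum = C @ log_spectrum.
    """
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_bands)[None, :]
    C = np.sqrt(2.0 / n_bands) * np.cos(np.pi * k * (2 * n + 1) / (2 * n_bands))
    C[0, :] /= np.sqrt(2.0)
    return C

def adapt_mean(cep_mean, coeffs, n_bands):
    """Map one cepstral mean vector through per-subband polynomials.

    coeffs: one coefficient array per subband, lowest order first
            (hypothetical values here, not EM estimates).
    """
    C = dct_matrix(len(cep_mean), n_bands)
    # Inverse DCT: C has orthonormal rows when n_ceps <= n_bands,
    # so its transpose maps cepstra back to the log-spectral domain.
    log_spec = C.T @ cep_mean
    # Split the log-spectral mean into (near-)equal-width subband vectors.
    subbands = np.array_split(log_spec, len(coeffs))
    # Apply each subband's polynomial element-wise.
    adapted = np.concatenate([
        np.polynomial.polynomial.polyval(sb, c)
        for sb, c in zip(subbands, coeffs)
    ])
    # Forward DCT back to the cepstral domain.
    return C @ adapted

# Example: 13 cepstral coefficients, 24 log-spectral bins, 4 subbands,
# each with an illustrative second-order polynomial b0 + b1*x + b2*x^2.
rng = np.random.default_rng(0)
mean_vec = rng.normal(size=13)
example_coeffs = [np.array([0.1, 0.9, 0.01])] * 4
adapted_mean = adapt_mean(mean_vec, example_coeffs, n_bands=24)
```

With the identity polynomial `[0, 1]` in every subband the function returns the input mean unchanged, which is a quick sanity check on the DCT round trip.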
Pages: 640-646
Number of pages: 7
Related papers
50 in total
  • [1] Maximum likelihood polynomial regression for robust speech recognition
    L Yong WU Zhenyang (School of Information Science and Engineering
    [J]. Chinese Journal of Acoustics, 2011, 30 (03) : 358 - 370
  • [2] Noisy Constrained Maximum-Likelihood Linear Regression for Noise-Robust Speech Recognition
    Kim, D. K.
    Gales, M. J. F.
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (02): : 315 - 325
  • [3] Adaptive Training with Noisy Constrained Maximum Likelihood Linear Regression for Noise Robust Speech Recognition
    Kim, D. K.
    Gales, M. J. F.
    [J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2367 - 2370
  • [4] JOINT CONSTRAINED MAXIMUM LIKELIHOOD REGRESSION FOR OVERLAPPING SPEECH RECOGNITION
    Kumatani, Kenichi
    Singh, Rita
    Faubel, Friedrich
    McDonough, John
    Oualil, Youssef
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 121 - 125
  • [5] REGULARIZED CONSTRAINED MAXIMUM LIKELIHOOD LINEAR REGRESSION FOR SPEECH RECOGNITION
    Ghalehjegh, Sina Hamidi
    Rose, Richard C.
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [6] A Variational Approach to Robust Maximum Likelihood Estimation for Speech Recognition
    Omar, Mohamed Kamal
    [J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1049 - 1052
  • [7] Subband correlation and robust speech recognition
    McAuley, J
    Ming, J
    Stewart, D
    Hanna, P
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2005, 13 (05): : 956 - 964
  • [8] Model Adaptation Algorithm Based on Central Subband Regression for Robust Speech Recognition
    Lu, Yong
    Zhou, Lin
    [J]. 2014 SEVENTH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID 2014), VOL 2, 2014,
  • [9] MAXIMUM LIKELIHOOD ADAPTATION OF HISTOGRAM EQUALIZATION WITH CONSTRAINT FOR ROBUST SPEECH RECOGNITION
    Xiao, Xiong
    Li, Jinyu
    Chng, Eng Siong
    Li, Haizhou
    [J]. 2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5480 - 5483
  • [10] Maximum likelihood sub-band adaptation for robust speech recognition
    Zhu, DL
    Nakamura, S
    Paliwal, KK
    Wang, RH
    [J]. SPEECH COMMUNICATION, 2005, 47 (03) : 243 - 264