Linear spectral transformation for robust speech recognition using maximum mutual information

Cited by: 7
Authors
Kim, Donghyun [1]
Yook, Dongsuk [1]
Affiliations
[1] Korea Univ, Dept Comp Sci & Engn, Seoul 136701, South Korea
Keywords
linear spectral transformation; maximum mutual information (MMI); rapid adaptation; robust speech recognition
DOI
10.1109/LSP.2006.891337
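Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes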
0808 ; 0809 ;
Abstract
This paper presents a transformation-based rapid adaptation technique for robust speech recognition using a linear spectral transformation (LST) and a maximum mutual information (MMI) criterion. Previously, a maximum likelihood linear spectral transformation (ML-LST) algorithm was proposed for fast adaptation in unknown environments. Since the MMI estimation method does not require evenly distributed training data and increases the a posteriori probability of the word sequences of the training data, we combine the linear spectral transformation method and the MMI estimation technique in order to achieve extremely rapid adaptation using only one word of adaptation data. The proposed algorithm, called MMI-LST, was implemented using the extended Baum-Welch algorithm and phonetic lattices, and evaluated on the TIMIT and FFMTIMIT corpora. It provides a relative reduction in the speech recognition error rate of 11.1% using only 0.25 s of adaptation data.
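For orientation, the MMI criterion named in the abstract is commonly written in the generic form below. This is the standard textbook objective, not a formula quoted from the paper; the symbols $\lambda$, $O_r$, $w_r$, $M_w$, and $R$ are assumptions introduced here for illustration.

```latex
F_{\mathrm{MMI}}(\lambda)
  = \sum_{r=1}^{R} \log
    \frac{p_{\lambda}(O_r \mid M_{w_r})\, P(w_r)}
         {\sum_{w} p_{\lambda}(O_r \mid M_{w})\, P(w)}
```

Here $O_r$ is the $r$-th adaptation utterance, $w_r$ its reference transcription, $M_w$ the acoustic model sequence for hypothesis $w$, and $P(w)$ the language-model prior. Maximizing $F_{\mathrm{MMI}}$ raises the posterior probability of the reference transcription relative to its competitors. The denominator sum over competing hypotheses is typically approximated with a lattice, which matches the abstract's mention of phonetic lattices. In a transformation-based setting such as the one described, $\lambda$ would presumably stand for the transformation parameters rather than the full model.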
Pages: 496-499
Page count: 4