Feature Adaptation for Robust Mobile Speech Recognition

Authors
Lee, Hyeopwoo [1]
Yook, Dongsuk [1]
Affiliations
[1] Korea Univ, Dept Comp & Commun Engn, Speech Informat Proc Lab, Seoul 136701, South Korea
Keywords
Speech recognition; speaker adaptation; environment adaptation; feature adaptation; feature space maximum likelihood linear regression (FMLLR); regression tree; LINEAR-REGRESSION; DEVICES
DOI
10.1109/TCE.2012.6415011
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification codes
0808; 0809
Abstract
Feature adaptation, such as feature space maximum likelihood linear regression (FMLLR), is useful for robust mobile speech recognition. However, because it relies on a single global transformation, its performance saturates quickly as the amount of adaptation data increases. To address this problem, we propose regression-tree-based FMLLR, which can adopt multiple transformations as the amount of adaptation data increases. Experimental results show that, compared with the conventional method, the proposed method further reduces the recognition error by 11.8% on a speaker adaptation task and by 13.6% on a noisy environment adaptation task.
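The idea of moving from one global transformation to several as adaptation data grows can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` class, the `min_frames` threshold, the class names, and the placeholder transforms below are all hypothetical, and the estimation of each affine transform (A, b) from adaptation data is omitted entirely. The sketch only shows how a regression tree might select which transform applies to a feature vector, backing off toward the root (the global transform) when a class has too little data.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vector = List[float]
Matrix = List[List[float]]


@dataclass
class Node:
    """One regression-tree node (hypothetical structure for illustration)."""
    name: str
    frames: float                      # adaptation frames observed for this class
    transform: Tuple[Matrix, Vector]   # (A, b); assumed to be estimated elsewhere
    parent: Optional["Node"] = None


def pick_transform_node(leaf: Node, min_frames: float) -> Node:
    """Walk up from a leaf until a node has enough data; the root is the fallback."""
    node = leaf
    while node.parent is not None and node.frames < min_frames:
        node = node.parent
    return node


def apply_affine(x: Vector, A: Matrix, b: Vector) -> Vector:
    """Compute A x + b for a single feature vector."""
    return [sum(a * xi for a, xi in zip(row, x)) + bi for row, bi in zip(A, b)]


def adapt(x: Vector, leaf: Node, min_frames: float) -> Vector:
    """Transform x with the deepest sufficiently trained transform above the leaf."""
    A, b = pick_transform_node(leaf, min_frames).transform
    return apply_affine(x, A, b)


# Placeholder tree: a root (global transform) with two child classes.
root = Node("root", 1000.0, ([[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5]))
vowels = Node("vowels", 700.0, ([[2.0, 0.0], [0.0, 2.0]], [0.0, 0.0]), parent=root)
consonants = Node("consonants", 300.0, ([[0.5, 0.0], [0.0, 0.5]], [1.0, 1.0]), parent=root)

x = [1.0, 2.0]
print(adapt(x, vowels, min_frames=500.0))      # vowels has enough data: own transform
print(adapt(x, consonants, min_frames=500.0))  # too little data: backs off to root
print(adapt(x, vowels, min_frames=2000.0))     # high threshold: global transform only
```

With a high threshold every class falls back to the root, which corresponds to the single global transformation; lowering the threshold, or collecting more data, lets more classes use their own transform. A practical system would also need to account for the log-determinant of each A when scoring transformed features, which this sketch leaves out.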
Pages: 1393-1398
Page count: 6