Boosting systems for large vocabulary continuous speech recognition

Cited by: 14
Authors
Saon, George [1]
Soltau, Hagen [1]
Affiliation
[1] IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, USA
Keywords
Speech recognition; Boosting; Acoustic modeling
DOI
10.1016/j.specom.2011.07.011
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We employ a variant of the popular AdaBoost algorithm to train multiple acoustic models such that the aggregate system exhibits improved performance over the individual recognizers. Each model is trained sequentially on re-weighted versions of the training data. At each iteration, the weights are decreased for the frames that are correctly decoded by the current system. These weights are then multiplied with the frame-level statistics for the decision trees and Gaussian mixture components of the next iteration system. The composite system uses a log-linear combination of HMM state observation likelihoods. We report experimental results on several broadcast news transcription setups which differ in the language being spoken (English and Arabic) and amounts of training data. Additionally, we study the impact of boosting on maximum likelihood (ML) and discriminatively trained acoustic models. Our findings suggest that significant gains can be obtained for small amounts of training data even after feature and model-space discriminative training. © 2011 Elsevier B.V. All rights reserved.
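The re-weighting and combination steps described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the multiplicative factor `beta` used to shrink the weights of correctly decoded frames, the renormalization to a mean weight of 1, and the per-model combination weights `alphas`. The abstract states only that weights decrease for correctly decoded frames, that they multiply the frame-level statistics, and that state likelihoods are combined log-linearly.

```python
import numpy as np

def reweight_frames(weights, correct, beta=0.5):
    """Shrink the weights of correctly decoded frames by `beta` (assumed
    update rule), then rescale so the mean weight stays at 1."""
    weights = np.asarray(weights, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    new_w = np.where(correct, weights * beta, weights)
    return new_w * (len(new_w) / new_w.sum())

def weighted_mean(features, weights):
    """Example of a weighted frame-level statistic: each frame's
    contribution to a Gaussian mean is multiplied by its weight."""
    features = np.asarray(features, dtype=float)  # shape (T, D)
    weights = np.asarray(weights, dtype=float)    # shape (T,)
    return (weights[:, None] * features).sum(axis=0) / weights.sum()

def log_linear_combine(log_likes, alphas):
    """Log-linear combination of per-model state log-likelihoods.
    log_likes has shape (K, T, S): K models, T frames, S HMM states.
    Returns an array of shape (T, S): sum_k alphas[k] * log_likes[k]."""
    log_likes = np.asarray(log_likes, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    return np.tensordot(alphas, log_likes, axes=1)

# Toy usage: four frames, the first two decoded correctly.
w = reweight_frames(np.ones(4), [True, True, False, False])
mu = weighted_mean([[0.0], [0.0], [3.0], [3.0]], w)
combined = log_linear_combine([[[-1.0, -2.0]], [[-3.0, -4.0]]], [0.5, 0.5])
```

In this toy run the misdecoded frames end up with twice the weight of the correctly decoded ones, so the weighted mean moves toward them, which is the intended effect of focusing the next model on frames the current system gets wrong.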
Pages: 212-218
Page count: 7