DATA SAMPLING ENSEMBLE ACOUSTIC MODELLING IN SPEAKER INDEPENDENT SPEECH RECOGNITION

Cited by: 2
Authors
Chen, Xin [1]
Zhao, Yunxin [1]
Affiliations
[1] Univ Missouri, Dept Comp Sci, Columbia, MO 65211 USA
Keywords
ensemble acoustic modeling; recurrent neural network; speaker overlapped clustering; data sampling; speaker adaptation;
DOI
10.1109/ICASSP.2010.5495029
Chinese Library Classification
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
In this paper, we extend our recent data-sampling-based ensemble acoustic modeling technique to the speaker-independent TIMIT task and propose new methods to further improve the effectiveness of the ensemble acoustic models. We propose applying overlapped speaker clustering in data sampling to construct an ensemble of acoustic models for speaker-independent speech recognition. In addition, we evaluate data sampling with recurrent neural networks for constructing an RNN-based frame classifier. We also investigate using CVEM in place of EM in our ensemble acoustic model training. By using these methods on the speaker-independent TIMIT phone recognition task, we have obtained a 2.5% absolute gain in phone accuracy over a standard HMM baseline system.
Pages: 5130-5133
Number of pages: 4