Discriminative incorporation of explicitly trained tone models into lattice based rescoring for Mandarin speech recognition

Cited by: 0
Authors
Huang, Hao [1]
Zhu, Jie [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Elect Engn, Shanghai 200240, Peoples R China
Keywords
explicit tone model incorporation; minimum phone error; discriminative training; Mandarin speech recognition;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Explicit tone modeling has been widely discussed in recent Mandarin speech recognition research. This paper proposes a discriminative method for incorporating explicitly trained tone models into lattice-based rescoring. The method uses discriminatively trained model weights to scale the acoustic model and tone model distributions. The weights are trained under the Minimum Phone Error (MPE) criterion using the Extended Baum-Welch algorithm. To account for different phonetic contexts, various model weighting schemes are evaluated. A smoothing technique is introduced to make model weight training more robust to overfitting. The proposed method is evaluated on tonal-syllable-output speech recognition tasks on a Mandarin LVCSR database. Results show that the proposed method achieves a significant error reduction over the traditional global-weight approach. A comparison with traditional embedded tone modeling is also made, which shows the importance of the proposed method when an explicit tone modeling approach is applied.
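The weighted rescoring idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the arc fields (`unit`, `tone`, `ac_logp`, `tone_logp`, `lm_logp`), the function names, the toy scores, and the count-based interpolation in `smooth_weight`, which merely stands in for the unspecified smoothing technique. MPE training of the weights with Extended Baum-Welch is not shown; the weights are taken as given.

```python
# Illustrative sketch only: log-linear rescoring of lattice paths with
# per-context weights on acoustic and tone model scores. All names and
# numbers are hypothetical, not the authors' implementation.

def smooth_weight(trained, global_weight, count, tau=10.0):
    """Back off a context-specific weight toward the global weight.

    Assumed stand-in for the paper's smoothing: contexts with few
    training tokens (small count) stay close to the global weight.
    """
    return (count * trained + tau * global_weight) / (count + tau)


def path_score(arcs, ac_weights, tone_weights, global_ac=1.0, global_tone=1.0):
    """Sum weighted log-scores over the arcs of one lattice path."""
    total = 0.0
    for arc in arcs:
        # Context-dependent weight if one exists, otherwise the global one.
        w_ac = ac_weights.get(arc["unit"], global_ac)
        w_tone = tone_weights.get(arc["tone"], global_tone)
        total += w_ac * arc["ac_logp"] + w_tone * arc["tone_logp"] + arc["lm_logp"]
    return total


def best_hypothesis(paths, ac_weights, tone_weights, **global_weights):
    """Return the label of the highest-scoring path."""
    return max(
        paths,
        key=lambda label: path_score(
            paths[label], ac_weights, tone_weights, **global_weights
        ),
    )


# Toy lattice with two competing tonal-syllable paths (made-up log scores).
SHARED_ARC = {"unit": "hao", "tone": 3, "ac_logp": -8.0, "tone_logp": -1.0, "lm_logp": -1.0}
EXAMPLE_PATHS = {
    "ma1 hao3": [
        {"unit": "ma", "tone": 1, "ac_logp": -10.0, "tone_logp": -2.0, "lm_logp": -1.0},
        SHARED_ARC,
    ],
    "ma3 hao3": [
        {"unit": "ma", "tone": 3, "ac_logp": -10.5, "tone_logp": -0.5, "lm_logp": -1.0},
        SHARED_ARC,
    ],
}

if __name__ == "__main__":
    # Tone model switched off (weight 0) versus a global tone weight of 1.
    print(best_hypothesis(EXAMPLE_PATHS, {}, {}, global_tone=0.0))
    print(best_hypothesis(EXAMPLE_PATHS, {}, {}, global_tone=1.0))
```

In this toy lattice the preferred hypothesis changes once the tone score is given nonzero weight, which is the effect that rescoring with an explicit tone model relies on. Passing non-empty `ac_weights` or `tone_weights` dictionaries corresponds loosely to the context-dependent weighting schemes the abstract mentions, with the global weights as the fallback.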
Pages: 1541-1544
Number of pages: 4
Related papers
50 items in total
  • [1] Improved mandarin speech recognition by lattice rescoring with enhanced tone models
    Wang, Huanliang
    Qian, Yao
    Soong, Frank
    Zhou, Jian-Lai
    Han, Jiqing
    [J]. CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2006, 4274 : 445 - +
  • [2] Scaling Laws for Discriminative Speech Recognition Rescoring Models
    Gu, Yile
    Shivakumar, Prashanth Gurunath
    Kolehmainen, Jari
    Gandhe, Ankur
    Rastrow, Ariya
    Bulyko, Ivan
    [J]. INTERSPEECH 2023, 2023, : 471 - 475
  • [3] Personalization for BERT-based Discriminative Speech Recognition Rescoring
    Kolehmainen, Jari
    Gu, Yile
    Gourav, Aditya
    Shivakumar, Prashanth Gurunath
    Gandhe, Ankur
    Rastrow, Ariya
    Bulyko, Ivan
    [J]. INTERSPEECH 2023, 2023, : 366 - 370
  • [4] RESCOREBERT: DISCRIMINATIVE SPEECH RECOGNITION RESCORING WITH BERT
    Xu, Liyan
    Gu, Yile
    Kolehmainen, Jari
    Khan, Haidar
    Gandhe, Ankur
    Rastrow, Ariya
    Stolcke, Andreas
    Bulyko, Ivan
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6117 - 6121
  • [5] ATTRIBUTE BASED LATTICE RESCORING IN SPONTANEOUS SPEECH RECOGNITION
    Chen, I-Fan
    Siniscalchi, Sabato Marco
    Lee, Chin-Hui
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [6] MAXIMUM ENTROPY BASED TONE MODELING FOR MANDARIN SPEECH RECOGNITION
    Wang, Xinhao
    Yu, Yansuo
    Wu, Xihong
    Chi, Huisheng
    [J]. 2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4850 - 4853
  • [7] A TONE RECOGNITION FRAMEWORK FOR CONTINUOUS MANDARIN SPEECH
    He, Lei
    Hao, Jie
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1575 - 1578
  • [8] Tone Modeling for Continuous Mandarin Speech Recognition
    Cao, Yang
    Zhang, Shuwu
    Huang, Taiyi
    Xu, Bo
    [J]. International Journal of Speech Technology, 2004, 7 (2-3) : 115 - 128
  • [9] LATTICE RESCORING STRATEGIES FOR LONG SHORT TERM MEMORY LANGUAGE MODELS IN SPEECH RECOGNITION
    Kumar, Shankar
    Nirschl, Michael
    Holtmann-Rice, Daniel
    Liao, Hank
    Suresh, Ananda Theertha
    Yu, Felix
    [J]. 2017 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2017, : 165 - 172
  • [10] Tone recognition of continuous Mandarin speech based on tone nucleus model and neural network
    Wang, Xiao-Dong
    Hirose, Keikichi
    Zhang, Jin-Song
    Minematsu, Nobuaki
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2008, E91D (06) : 1748 - 1755