SELF-CRITICAL SEQUENCE TRAINING FOR AUTOMATIC SPEECH RECOGNITION

Cited by: 4
Authors
Chen, Chen [1]
Hu, Yuchen [1]
Hou, Nana [1]
Qi, Xiaofeng [1]
Zou, Heqing [1]
Chng, Eng Siong [1]
Affiliation
[1] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
Funding
National Research Foundation, Singapore
Keywords
Automatic speech recognition; Reinforcement learning
DOI
10.1109/ICASSP43922.2022.9746668
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Although sequence-to-sequence models have achieved remarkable success in automatic speech recognition (ASR), two mismatches between training and testing can degrade performance: 1) the commonly used cross-entropy criterion maximizes the log-likelihood of the training data, whereas performance is evaluated by word error rate (WER), not log-likelihood; 2) teacher forcing makes the model depend on the ground truth during training, so the model is never exposed to its own predictions before testing. In this paper, we propose an optimization method called self-critical sequence training (SCST) to bring the training procedure much closer to the testing phase. As a reinforcement learning (RL) based method, SCST uses a customized reward function to tie the training criterion to WER. Furthermore, it removes the reliance on teacher forcing and aligns the model with its inference procedure. We conducted experiments on both clean and noisy speech datasets, and the results show that the proposed SCST achieves relative WER improvements of 8.7% and 7.8%, respectively, over the baseline.
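The abstract does not spell out the reward or the loss, so the following is only a minimal sketch of how a self-critical objective is commonly set up, not the authors' implementation. It assumes the reward is the negative WER of a hypothesis and that the baseline is the reward of the model's own greedy decode, as in the general self-critical formulation. The function names (`edit_distance`, `wer`, `scst_loss`) and the plain-list representation of token log-probabilities are hypothetical choices made for illustration.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance between reference and hypothesis."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(ref: str, hyp: str) -> float:
    """Word error rate: edit distance divided by the reference length."""
    ref_words = ref.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


def scst_loss(ref: str, sampled_hyp: str, greedy_hyp: str,
              sampled_token_logprobs: Sequence[float]) -> float:
    """Self-critical loss for one utterance (illustrative assumption).

    The reward is assumed to be -WER. The greedy hypothesis serves as the
    baseline, so a sample is reinforced only if it beats the greedy decode.
    In a real training loop the advantage is treated as a constant and the
    gradient flows only through the log-probabilities.
    """
    advantage = -wer(ref, sampled_hyp) - (-wer(ref, greedy_hyp))
    return -advantage * sum(sampled_token_logprobs)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    sampled = "the cat sat on mat"    # hypothetical sampled hypothesis
    greedy = "the cat sit on a mat"   # hypothetical greedy hypothesis
    logprobs = [-0.1, -0.2, -0.3, -0.2, -0.2]
    print(wer(reference, sampled), wer(reference, greedy))
    print(scst_loss(reference, sampled, greedy, logprobs))
```

Because both hypotheses come from the model's own decoding rather than from teacher forcing, this kind of objective addresses the two mismatches the abstract lists: the training signal is computed from WER, and the model is scored on its own predictions.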
Pages: 3688-3692
Number of pages: 5