ADVERSARIAL TRAINING OF END-TO-END SPEECH RECOGNITION USING A CRITICIZING LANGUAGE MODEL

Authors
Liu, Alexander H. [1]
Lee, Hung-yi [1]
Lee, Lin-shan [1]
Affiliations
[1] National Taiwan University, College of Electrical Engineering and Computer Science, Taipei, Taiwan
Keywords
automatic speech recognition; end-to-end; adversarial training; criticizing language model
DOI
10.1109/icassp.2019.8683602
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper we propose a novel Adversarial Training (AT) approach for end-to-end speech recognition using a Criticizing Language Model (CLM). In this approach, the CLM and the automatic speech recognition (ASR) model challenge and learn from each other iteratively to improve performance. Since the CLM takes only text as input, large quantities of unpaired text data can be utilized within end-to-end training. Moreover, AT can be applied to any end-to-end ASR model using any deep-learning-based language modeling framework, and is compatible with any existing end-to-end decoding method. Initial results with an example experimental setup demonstrate that the proposed approach is able to gain consistent improvements efficiently from auxiliary text data under different scenarios.
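The alternating scheme described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not state the training objective, so the example assumes a WGAN-style critic with weight clipping, a linear critic over token distributions standing in for the CLM, and a single softmax over a three-token vocabulary standing in for the ASR model's output. The supervised ASR loss on paired speech is omitted, and all names (`adversarial_training`, `lr_critic`, `lr_asr`, `clip`) are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def l1(a, b):
    """L1 distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def adversarial_training(text_tokens, vocab_size, steps=300, n_critic=5,
                         lr_critic=1.0, lr_asr=0.05, clip=1.0):
    # Empirical token distribution of the unpaired text (the "real" data).
    real = [0.0] * vocab_size
    for t in text_tokens:
        real[t] += 1.0 / len(text_tokens)

    theta = [0.0] * vocab_size  # toy stand-in for the ASR output logits
    w = [0.0] * vocab_size      # toy linear critic standing in for the CLM
    history = []

    for _ in range(steps):
        p = softmax(theta)

        # Critic step (assumed WGAN-style): raise the score of real text and
        # lower the score of the ASR output, then clip the weights.
        for _ in range(n_critic):
            w = [min(clip, max(-clip, wi + lr_critic * (ri - pi)))
                 for wi, ri, pi in zip(w, real, p)]

        # ASR step: gradient ascent on the critic score s = sum_j w_j * p_j.
        # For a softmax output, ds/dtheta_k = p_k * (w_k - s).
        score = sum(wi * pi for wi, pi in zip(w, p))
        theta = [ti + lr_asr * pi * (wi - score)
                 for ti, wi, pi in zip(theta, w, p)]

        history.append(l1(real, softmax(theta)))

    return softmax(theta), w, real, history


# Unpaired "text": token 0 is frequent, token 2 is rare (made-up toy data).
probs, critic, real, history = adversarial_training(
    [0] * 7 + [1] * 2 + [2], vocab_size=3)
print("L1 distance to text distribution after training:", history[-1])
```

Only the critic update reads the text data, which mirrors the abstract's point that unpaired text can be used without any matching audio. In a real system the ASR update would combine this adversarial term with the usual supervised loss on paired data.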
Pages: 6176-6180
Page count: 5