ADVERSARIAL TRAINING OF END-TO-END SPEECH RECOGNITION USING A CRITICIZING LANGUAGE MODEL

Cited by: 0
Authors
Liu, Alexander H. [1]
Lee, Hung-yi [1]
Lee, Lin-shan [1]
Affiliation
[1] National Taiwan University, College of Electrical Engineering and Computer Science, Taipei, Taiwan
Keywords
automatic speech recognition; end-to-end; adversarial training; criticizing language model
DOI
10.1109/icassp.2019.8683602
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper we propose a novel Adversarial Training (AT) approach for end-to-end speech recognition using a Criticizing Language Model (CLM). In this way the CLM and the automatic speech recognition (ASR) model can challenge and learn from each other iteratively to improve performance. Since the CLM takes only text as input, large quantities of unpaired text data can be utilized in this approach within end-to-end training. Moreover, AT can be applied to any end-to-end ASR model using any deep-learning-based language modeling framework, and is compatible with any existing end-to-end decoding method. Initial results with an example experimental setup demonstrate that the proposed approach is able to gain consistent improvements efficiently from auxiliary text data under different scenarios.
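The alternating "challenge and learn from each other" loop described in the abstract can be illustrated with a deliberately tiny sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: the linear critic, the unigram token statistics standing in for text, the L2-regularized Wasserstein-style critic objective, and all names (`train`, `theta`, `w`, `l2`). The supervised ASR loss on paired speech is omitted, so the sketch shows only how a critic trained on unpaired text can steer a recognizer's output distribution.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def total_variation(p, q):
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def train(text_tokens, vocab_size, steps=2000, n_critic=5,
          lr_critic=0.5, lr_asr=0.5, l2=1.0):
    # Unigram statistics of the unpaired text: the critic's "real" examples.
    p_real = [text_tokens.count(v) / len(text_tokens) for v in range(vocab_size)]
    theta = [0.0] * vocab_size  # hypothetical stand-in for ASR output logits
    w = [0.0] * vocab_size      # hypothetical linear critic: score(y) = w . y
    for _ in range(steps):
        q = softmax(theta)
        # Critic update (assumed objective): raise scores on real text,
        # lower them on the recognizer's output, with L2 regularization.
        for _ in range(n_critic):
            w = [wi + lr_critic * ((pi - qi) - l2 * wi)
                 for wi, pi, qi in zip(w, p_real, q)]
        # Recognizer update: gradient ascent on the critic's score of its
        # own output, using d(w . softmax(theta))/d theta_j = q_j (w_j - w . q).
        mean_score = dot(w, q)
        theta = [ti + lr_asr * qi * (wi - mean_score)
                 for ti, qi, wi in zip(theta, q, w)]
    return p_real, softmax(theta)

# Toy "unpaired text" over a 3-token vocabulary.
text = [0] * 7 + [1] * 2 + [2] * 1
p_real, q_final = train(text, vocab_size=3)
gap_before = total_variation(p_real, [1 / 3] * 3)
gap_after = total_variation(p_real, q_final)
print(gap_before, gap_after)
```

In this toy setting the gap between the recognizer's output distribution and the text statistics shrinks as the two models alternate updates. A real system would replace the linear critic with a neural language model and combine the critic's signal with the usual ASR training loss.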
Pages: 6176-6180
Page count: 5