Improved Deliberation Network with Text Pre-training for Code-Switching Automatic Speech Recognition

Cited by: 0
Authors
Shen, Zhijie [1]
Guo, Wu [1]
Affiliations
[1] Univ Sci & Technol China, Dept Elect Engn & Informat Sci EEIS, Hefei, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
automatic speech recognition; code-switching; deliberation network; text pre-training;
DOI
10.21437/Interspeech.2022-221
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
This paper proposes an improved deliberation network (DN) for end-to-end code-switching (CS) automatic speech recognition (ASR). In a conventional DN, the acoustic encoding and the first-pass hypothesis encoding are computed separately and simply combined by summation, which cannot take full advantage of their potential complementarity. The proposed improved DN therefore exploits the relationship between the two encodings through a two-stage process: first, it integrates the two encodings into a unified semantic space through a shared encoder; second, it captures the relevant information from the acoustic encoding through an attention mechanism before the final decoding process. Moreover, the lack of paired training data restricts the generalization ability of the model in CS ASR. To address this problem, the developed DN is pre-trained with a denoising sequence-to-sequence (seq2seq) objective using unpaired text data. Experiments on a Chinese-English CS dataset demonstrate the effectiveness of the proposed method. Compared with the conventional DN, the proposed model achieves a 13.5% relative error rate reduction.
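
As a rough illustration of the denoising seq2seq idea mentioned in the abstract, the sketch below builds (noisy input, clean target) pairs from unpaired text. The abstract does not say which noise function the authors use, so the token masking and deletion here, the `<mask>` symbol, the probabilities, and the names `add_noise` and `make_pair` are all assumptions made for illustration. How such pairs are fed into the DN is not covered.

```python
import random

MASK = "<mask>"  # hypothetical mask symbol, not taken from the paper


def add_noise(tokens, mask_prob=0.15, delete_prob=0.05, seed=None):
    """Return a corrupted copy of `tokens` for denoising pre-training.

    Each token is independently deleted with probability `delete_prob`,
    replaced by MASK with probability `mask_prob`, or kept unchanged.
    The noise types and rates are assumptions, not the paper's settings.
    """
    rng = random.Random(seed)
    noisy = []
    for tok in tokens:
        r = rng.random()
        if r < delete_prob:
            continue  # drop the token entirely
        if r < delete_prob + mask_prob:
            noisy.append(MASK)  # hide the token behind the mask symbol
        else:
            noisy.append(tok)  # keep the token as-is
    return noisy


def make_pair(sentence, **noise_kwargs):
    """Build one (noisy input, clean target) pair from whitespace-split text."""
    clean = sentence.split()
    return add_noise(clean, **noise_kwargs), clean


if __name__ == "__main__":
    # A made-up Chinese-English code-switching sentence, pre-split by spaces.
    noisy, clean = make_pair("我 想 听 一 首 Taylor Swift 的 歌", seed=0)
    print("input :", noisy)
    print("target:", clean)
```

A seq2seq model trained on such pairs learns to reconstruct the clean target from the corrupted input, which needs only text and no paired audio.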
Pages: 3854-3858
Number of pages: 5