On the End-to-End Solution to Mandarin-English Code-switching Speech Recognition

Cited by: 21
Authors
Zeng, Zhiping [1]
Khassanov, Yerbolat [2]
Van Tung Pham [1]
Xu, Haihua [1]
Chng, Eng Siong [1, 2]
Li, Haizhou [3]
Affiliations
[1] Nanyang Technol Univ, Temasek Labs, Singapore, Singapore
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
[3] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore
Source
Interspeech 2019
Keywords
Code-switching; speech recognition; end-to-end; multitask learning; language identification
DOI
10.21437/Interspeech.2019-1429
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
Code-switching (CS) refers to a linguistic phenomenon where a speaker uses different languages in an utterance or between alternating utterances. In this work, we study end-to-end (E2E) approaches to the Mandarin-English code-switching speech recognition task. We first examine the effectiveness of using data augmentation and byte-pair encoding (BPE) subword units. More importantly, we propose a multitask learning recipe, where a language identification task is explicitly learned in addition to the E2E speech recognition task. Furthermore, we introduce an efficient word vocabulary expansion method for language modeling to alleviate data sparsity issues under the code-switching scenario. Experimental results on the SEAME data, a Mandarin-English code-switching corpus, demonstrate the effectiveness of the proposed methods.
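The multitask idea in the abstract, learning language identification alongside recognition, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the per-token language labels are derived from a simple CJK character check, the helper names `lid_labels` and `multitask_loss` are hypothetical, and the linear interpolation of the two losses with weight `lid_weight` is only one common way to combine auxiliary objectives.

```python
from typing import List


def is_cjk(ch: str) -> bool:
    """Return True if the character lies in the CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def lid_labels(tokens: List[str]) -> List[str]:
    """Assign a hypothetical language tag to each transcript token.

    A token containing any CJK ideograph is tagged "ZH"; everything else
    is tagged "EN". This heuristic is an assumption for illustration only.
    """
    return ["ZH" if any(is_cjk(c) for c in tok) else "EN" for tok in tokens]


def multitask_loss(asr_loss: float, lid_loss: float, lid_weight: float = 0.1) -> float:
    """Interpolate the recognition loss and the auxiliary LID loss.

    The interpolation form and the default weight are assumptions,
    not values reported in the paper.
    """
    if not 0.0 <= lid_weight <= 1.0:
        raise ValueError("lid_weight must be in [0, 1]")
    return (1.0 - lid_weight) * asr_loss + lid_weight * lid_loss


if __name__ == "__main__":
    # A code-switched utterance split into tokens (illustrative only).
    tokens = ["我", "想", "go", "shopping"]
    print(lid_labels(tokens))
    print(multitask_loss(asr_loss=2.0, lid_loss=4.0, lid_weight=0.25))
```

In a real system the two losses would come from the network's recognition and language-identification outputs; plain floats are used here only to keep the sketch self-contained.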
Pages: 2165-2169
Number of pages: 5