Acoustic data augmentation for Mandarin-English code-switching speech recognition

Cited by: 20
Authors
Long, Yanhua [1]
Li, Yijie [2]
Zhang, Qiaozheng [1]
Wei, Shuang [1]
Ye, Hong [1]
Yang, Jichen [3]
Affiliations
[1] Shanghai Normal Univ, SHNU Unisound Joint Lab Nat Human Comp Interact, Shanghai, Peoples R China
[2] Unisound AI Technol Co Ltd, Beijing, Peoples R China
[3] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore
Funding
National Natural Science Foundation of China;
Keywords
Data augmentation; Code-switching; Acoustic event detection; Speech recognition; NEURAL-NETWORKS;
DOI
10.1016/j.apacoust.2019.107175
Chinese Library Classification number
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
Code-switching (CS) is a multilingual phenomenon in which a speaker uses different languages within an utterance or across alternating utterances. Developing large-scale datasets for training code-switching acoustic and language models is challenging and extremely expensive. In this paper, we focus on acoustic data augmentation for the Mandarin-English CS speech recognition task. The effectiveness of conventional acoustic data augmentation approaches is examined. More importantly, we propose a CS acoustic event detection system based on a deep neural network to extract real code-switching speech segments automatically. Semi-supervised and active learning techniques are then investigated to generate transcriptions of these segments. Finally, a code-switching speech synthesis system is introduced to further enhance the acoustic modeling. Experimental results on the OC16-CE80 data, a Mandarin-English mixlingual speech corpus, demonstrate the effectiveness of the proposed methods. © 2019 Elsevier Ltd. All rights reserved.
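The abstract does not say how semi-supervised and active learning are combined to transcribe the detected segments. One common pattern, shown here only as a minimal sketch and not as the authors' method, is to route each segment by recognizer confidence: high-confidence hypotheses are kept as automatic labels, and the rest are queued for human annotation, least confident first. The `Segment` type, the `route_segments` function, the 0.9 threshold, and the example data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seg_id: str
    hypothesis: str    # transcript from a seed recognizer (assumed to exist)
    confidence: float  # recognizer confidence in [0, 1] (assumed)


def route_segments(segments, threshold=0.9):
    """Split segments into auto-labelled and to-be-annotated lists.

    threshold is an illustrative value, not one taken from the paper.
    """
    auto_labelled = []     # semi-supervised side: keep the machine transcript
    needs_annotation = []  # active-learning side: send to a human annotator
    for seg in segments:
        if seg.confidence >= threshold:
            auto_labelled.append(seg)
        else:
            needs_annotation.append(seg)
    # Annotate the least confident segments first.
    needs_annotation.sort(key=lambda s: s.confidence)
    return auto_labelled, needs_annotation


# Made-up example data for illustration only.
segments = [
    Segment("utt1", "我们 meeting 改到 明天", 0.95),
    Segment("utt2", "这个 feature 还 没 上线", 0.40),
    Segment("utt3", "帮 我 check 一下 email", 0.72),
]
auto_labelled, needs_annotation = route_segments(segments)
print([s.seg_id for s in auto_labelled])
print([s.seg_id for s in needs_annotation])
```

In such a setup the threshold trades annotation cost against label noise, and would normally be tuned on held-out data.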
Pages: 7
Related papers
50 in total
  • [1] Pronunciation augmentation for Mandarin-English code-switching speech recognition
    Long, Yanhua
    Wei, Shuang
    Lian, Jie
    Li, Yijie
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2021, 2021 (01)
  • [2] Pronunciation augmentation for Mandarin-English code-switching speech recognition
    Yanhua Long
    Shuang Wei
    Jie Lian
    Yijie Li
    EURASIP Journal on Audio, Speech, and Music Processing, 2021
  • [3] Mandarin-English Code-switching Speech Recognition
    Xu, Haihua
    Van Tung Pham
    Kyaw, Zin Tun
    Lim, Zhi Hao
    Chng, Eng Siong
    Li, Haizhou
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 554 - 555
  • [4] NON-AUTOREGRESSIVE MANDARIN-ENGLISH CODE-SWITCHING SPEECH RECOGNITION
    Chuang, Shun-Po
    Chang, Heng-Jui
    Huang, Sung-Feng
    Lee, Hung-yi
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 465 - 472
  • [5] Cyclic Transfer Learning for Mandarin-English Code-Switching Speech Recognition
    Nga, Cao Hong
    Vu, Duc-Quang
    Luong, Huong Hoang
    Huang, Chien-Lin
    Wang, Jia-Ching
    IEEE SIGNAL PROCESSING LETTERS, 2023, 30 : 1387 - 1391
  • [6] ADDRESSING ACCENT MISMATCH IN MANDARIN-ENGLISH CODE-SWITCHING SPEECH RECOGNITION
    Tan, Zhili
    Fan, Xinghua
    Zhu, Hui
    Lin, Ed
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8259 - 8263
  • [7] Language-specific Acoustic Boundary Learning for Mandarin-English Code-switching Speech Recognition
    Fan, Zhiyun
    Dong, Linhao
    Shen, Chen
    Liang, Zhenlin
    Zhang, Jun
    Lu, Lu
    Ma, Zejun
    INTERSPEECH 2023, 2023, : 3322 - 3326
  • [8] On the End-to-End Solution to Mandarin-English Code-switching Speech Recognition
    Zeng, Zhiping
    Khassanov, Yerbolat
    Van Tung Pham
    Xu, Haihua
    Chng, Eng Siong
    Li, Haizhou
    INTERSPEECH 2019, 2019, : 2165 - 2169
  • [9] INVESTIGATING END-TO-END SPEECH RECOGNITION FOR MANDARIN-ENGLISH CODE-SWITCHING
    Shan, Changhao
    Weng, Chao
    Wang, Guangsen
    Su, Dan
    Luo, Min
    Yu, Dong
    Xie, Lei
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6056 - 6060
  • [10] A Mandarin-English Code-Switching Corpus
    Li, Ying
    Yu, Yue
    Fung, Pascale
    LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012, : 2515 - 2519