A FIRST SPEECH RECOGNITION SYSTEM FOR MANDARIN-ENGLISH CODE-SWITCH CONVERSATIONAL SPEECH

Cited by: 0
Authors
Vu, Ngoc Thang [1]
Lyu, Dau-Cheng [2]
Weiner, Jochen [1]
Telaar, Dominic [1]
Schlippe, Tim [1]
Blaicher, Fabian [1]
Chng, Eng-Siong [2]
Schultz, Tanja [1]
Li, Haizhou [2]
Affiliations
[1] Karlsruhe Institute of Technology (KIT), Institute for Anthropomatics, Cognitive Systems Lab, Karlsruhe, Germany
[2] Nanyang Technological University, School of Computer Engineering, Singapore 639798, Singapore
Keywords
code-switching; multilingual speech recognition
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
This paper presents first steps toward a large vocabulary continuous speech recognition (LVCSR) system for conversational Mandarin-English code-switching (CS) speech. We applied state-of-the-art techniques such as speaker adaptive and discriminative training to build the first baseline system on the SEAME (South East Asia Mandarin-English) corpus [1]. For acoustic modeling, we applied different phone merging approaches based on the International Phonetic Alphabet (IPA) and the Bhattacharyya distance, in combination with discriminative training, to improve accuracy. At the language model level, we investigated statistical machine translation (SMT) based text generation approaches for building code-switching language models. Furthermore, we integrated information provided by a language identification (LID) system into the decoding process using a multi-stream approach. Our best 2-pass system achieves a Mixed Error Rate (MER) of 36.6% on the SEAME development set.
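The abstract reports its result as a Mixed Error Rate but does not define the metric. In code-switching work, MER is commonly computed by scoring Mandarin at the character level and English at the word level over a single edit-distance alignment. The sketch below illustrates that reading only; the tokenization rule and the function names `mixed_tokens`, `edit_distance`, and `mixed_error_rate` are assumptions for illustration, not the authors' scoring tool.

```python
import re

# Assumption: each CJK Unified Ideograph is one token (character-level scoring),
# and any other run of non-space characters is one token (word-level scoring).
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[^\s\u4e00-\u9fff]+")


def mixed_tokens(text):
    """Split a transcript into Mandarin characters and lowercased English words."""
    return _TOKEN.findall(text.lower())


def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists (unit cost per edit)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def mixed_error_rate(references, hypotheses):
    """Total edits divided by total reference tokens over paired transcripts."""
    errors = 0
    total = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens = mixed_tokens(ref)
        errors += edit_distance(ref_tokens, mixed_tokens(hyp))
        total += len(ref_tokens)
    if total == 0:
        raise ValueError("references contain no tokens")
    return errors / total
```

For example, scoring the hypothesis "我想去 shopping" against the reference "我 想 去 shopping mall" gives five reference tokens and one deletion. Whitespace between Mandarin characters does not affect the score, because each character is its own token either way.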
Pages: 4889-4892
Page count: 4
Related papers
50 in total
  • [1] A FIRST SPEECH RECOGNITION SYSTEM FOR MANDARIN-ENGLISH CODE-SWITCH CONVERSATIONAL SPEECH
    Vu, Ngoc Thang
    Lyu, Dau-Cheng
    Weiner, Jochen
    Telaar, Dominic
    Schlippe, Tim
    Blaicher, Fabian
    Chng, Eng-Siong
    Schultz, Tanja
    Li, Haizhou
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012: 4889-4892
  • [2] Mandarin-English Code-switching Speech Recognition
    Xu, Haihua
    Pham, Van Tung
    Kyaw, Zin Tun
    Lim, Zhi Hao
    Chng, Eng Siong
    Li, Haizhou
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018: 554-555
  • [3] LANGUAGE DIARIZATION FOR CODE-SWITCH CONVERSATIONAL SPEECH
    Lyu, Dau-Cheng
    Chng, Eng-Siong
    Li, Haizhou
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013: 7314-7318
  • [4] Pronunciation augmentation for Mandarin-English code-switching speech recognition
    Long, Yanhua
    Wei, Shuang
    Lian, Jie
    Li, Yijie
    [J]. EURASIP Journal on Audio, Speech, and Music Processing, 2021
  • [5] Pronunciation augmentation for Mandarin-English code-switching speech recognition
    Long, Yanhua
    Wei, Shuang
    Lian, Jie
    Li, Yijie
    [J]. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2021, 2021(01)
  • [6] Acoustic data augmentation for Mandarin-English code-switching speech recognition
    Long, Yanhua
    Li, Yijie
    Zhang, Qiaozheng
    Wei, Shuang
    Ye, Hong
    Yang, Jichen
    [J]. APPLIED ACOUSTICS, 2020, 161
  • [7] NON-AUTOREGRESSIVE MANDARIN-ENGLISH CODE-SWITCHING SPEECH RECOGNITION
    Chuang, Shun-Po
    Chang, Heng-Jui
    Huang, Sung-Feng
    Lee, Hung-yi
    [J]. 2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021: 465-472
  • [8] Cyclic Transfer Learning for Mandarin-English Code-Switching Speech Recognition
    Nga, Cao Hong
    Vu, Duc-Quang
    Luong, Huong Hoang
    Huang, Chien-Lin
    Wang, Jia-Ching
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2023, 30: 1387-1391
  • [9] ADDRESSING ACCENT MISMATCH IN MANDARIN-ENGLISH CODE-SWITCHING SPEECH RECOGNITION
    Tan, Zhili
    Fan, Xinghua
    Zhu, Hui
    Lin, Ed
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020: 8259-8263
  • [10] On the End-to-End Solution to Mandarin-English Code-switching Speech Recognition
    Zeng, Zhiping
    Khassanov, Yerbolat
    Pham, Van Tung
    Xu, Haihua
    Chng, Eng Siong
    Li, Haizhou
    [J]. INTERSPEECH 2019, 2019: 2165-2169