Finding Complex Features for Guest Language Fragment Recovery in Resource-Limited Code-Mixed Speech Recognition

Cited by: 2
Authors
Heidel, Aaron [1]
Lu, Hsiang-Hung [2]
Lee, Lin-Shan [2]
Affiliations
[1] Natl Taiwan Univ, Dept Comp Sci & Informat Engn, Taipei 10617, Taiwan
[2] Natl Taiwan Univ, Dept Elect Engn, Taipei 10617, Taiwan
Keywords
Bilingual; code-mixing; language identification; speech recognition; WORDS;
DOI
10.1109/TASLP.2015.2469634
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The rise of mobile devices and online learning brings into sharp focus the importance of speech recognition not only for the many languages of the world but also for code-mixed speech, especially where English is the second language. The recognition of code-mixed speech, where the speaker mixes languages within a single utterance, is a challenge for both computers and humans, not least because of the limited training data. We conduct research on a Mandarin-English code-mixed lecture corpus, where Mandarin is the host language and English the guest language, and attempt to find complex features for the recovery of English segments that were misrecognized in the initial recognition pass. We propose a multi-level framework wherein both low-level and high-level cues are jointly considered; we use phonotactic, prosodic, and linguistic cues in addition to acoustic-phonetic cues to discriminate at the frame level between English- and Chinese-language segments. We develop a simple and exact method for CRF feature induction, and improved methods for using cascaded features derived from the training corpus. By additionally tuning the data imbalance ratio between English and Chinese, we demonstrate highly significant improvements over previous work in the recovery of English-language segments, and demonstrate performance superior to DNN-based methods. We demonstrate considerable performance improvements not only with the traditional GMM-HMM recognition paradigm but also with a state-of-the-art hybrid CD-HMM-DNN recognition framework.
Pages: 2148-2161
Page count: 14
Related papers
  • [1] RECOGNITION OF HIGHLY IMBALANCED CODE-MIXED BILINGUAL SPEECH WITH FRAME-LEVEL LANGUAGE DETECTION BASED ON BLURRED POSTERIORGRAM
    Yeh, Ching-Feng
    Heidel, Aaron
    Lee, Hong-Yi
    Lee, Lin-Shan
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012: 4873-4876