A Review of the Mandarin-English Code-switching Corpus: SEAME

Authors
Lee, Grandee [1]
Ho, Thi-Nga [2]
Chng, Eng-Siong [2]
Li, Haizhou [1]
Affiliations
[1] Natl Univ Singapore, Elect & Comp Engn Dept, Singapore 117583, Singapore
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
Keywords
Code-switching corpus; Mandarin English corpus; SEAME
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we report the development of the South East Asia Mandarin-English (SEAME) corpus, which includes 63 hours of transcribed spontaneous Mandarin-English code-switching speech in its first release and an update of an additional 129 transcribed hours of speech. The corpus was developed for code-switching speech recognition research, such as LVCSR, language recognition, and language segmentation. It has been publicly available through the LDC since 2015. The corpus was recorded in unscripted interview and conversation settings and therefore consists of spontaneous speech. This paper presents comprehensive statistics and an analysis of the corpus after the update in terms of its composition, speaker profile, and code-switching characteristics. It also reviews the corpus's suitability for various kinds of code-switching research and discusses possible further developments.
Pages: 210-213
Page count: 4