Effects of Dialectal Code-Switching on Speech Modules: A Study using Egyptian Arabic Broadcast Speech

Cited by: 4
Authors
Chowdhury, Shammur A. [1 ]
Samih, Younes [1 ]
Eldesouki, Mohamed [2 ]
Ali, Ahmed [1 ]
Affiliations
[1] HBKU, Qatar Comp Res Inst, Doha, Qatar
[2] Concordia Univ, Montreal, PQ, Canada
Source
Keywords
code-switching; dialect identification; corpus; code mixing index
DOI
10.21437/Interspeech.2020-2271
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
Intra-utterance code-switching (CS) is the alternation between two or more languages within the same utterance. Although spoken dialectal code-switching (DCS) is more challenging than CS, it remains largely unexplored. In this study, we describe a method for building the first spoken DCS corpus. The corpus is annotated at the token level, taking into account both linguistic and acoustic cues for dialectal Arabic. For detailed analysis, we evaluate Arabic automatic speech recognition (ASR), Arabic dialect identification (ADI), and natural language processing (NLP) modules on the DCS corpus. Our results highlight the importance of lexical information for discriminating between the DCS labels. We observe that the performance of the different models depends heavily on the degree of code-mixing at the token level and on its complexity at the utterance level.
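The keywords list a code mixing index, and the abstract ties model performance to the degree of code-mixing. This record does not give the formula the authors used. As a minimal sketch only, the utterance-level Code-Mixing Index (CMI) as commonly defined by Das and Gambäck can be computed from token-level labels as shown here. The label names `MSA`, `EGY`, and `other`, and the function name `code_mixing_index`, are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter


def code_mixing_index(tags, neutral=frozenset({"other"})):
    """Utterance-level CMI: 100 * (1 - max_i(w_i) / (n - u)), or 0 if n == u.

    tags    -- one label per token (label names here are hypothetical)
    neutral -- labels treated as language-independent (the u tokens)
    """
    n = len(tags)
    # Count only tokens that carry a language/dialect label.
    counts = Counter(t for t in tags if t not in neutral)
    u = n - sum(counts.values())
    if n == u:
        # Empty utterance, or nothing but neutral tokens: no mixing to measure.
        return 0.0
    dominant = max(counts.values())
    return 100.0 * (1.0 - dominant / (n - u))


# Hypothetical utterance: three MSA tokens, one EGY token, one neutral token.
example = ["MSA", "MSA", "EGY", "MSA", "other"]
score = code_mixing_index(example)
```

A monolingual utterance scores 0, and the score rises as the dominant variety's share of the labeled tokens falls, which is one way to quantify the "degree of code-mixing" the abstract refers to.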
Pages: 2382-2386
Page count: 5
Related papers
50 in total
  • [41] Lattice-based Data Augmentation for Code-switching Speech Recognition
    Hartanto, Roland
    Uto, Kuniaki
    Shinoda, Koichi
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 1667 - 1672
  • [42] End-to-End Language Diarization for Bilingual Code-Switching Speech
    Liu, Hexin
    Perera, Leibny Paola Garcia
    Zhang, Xinyi
    Dauwels, Justin
    Khong, Andy W. H.
    Khudanpur, Sanjeev
    Styles, Suzy J.
    INTERSPEECH 2021, 2021, : 1489 - 1493
  • [43] MECOS: A bilingual Manipuri-English spontaneous code-switching speech corpus for automatic speech recognition
    Singh, Naorem Karline
    Chanu, Yambem Jina
    Pangsatabam, Hoomexsun
    COMPUTER SPEECH AND LANGUAGE, 2024, 87
  • [44] Abriendo closings in bilingual radio speech: Discourse strategies, code-switching, and the interactive construction of broadcast structures and institutional identity
    Tseng, Amelia
    TEXT & TALK, 2018, 38 (04) : 481 - 502
  • [45] Head directionality and intrasentential code-switching: A study of Japanese Canadian and Korean Americans' bilingual speech
    Nishimura, M
    Yoon, KK
    JAPANESE/KOREAN LINGUISTICS, VOL 8, 1998, : 121 - 130
  • [46] ArzEn: A Speech Corpus for Code-switched Egyptian Arabic-English
    Hamed, Injy
    Vu, Ngoc Thang
    Abdennadher, Slim
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 4237 - 4246
  • [47] Improving N-gram Language Modeling for Code-switching Speech Recognition
    Zeng, Zhiping
    Xu, Haihua
    Chong, Tze Yuang
    Chng, Eng-Siong
    Li, Haizhou
    2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC 2017), 2017, : 1546 - 1551
  • [48] An Evaluation Benchmark for Automatic Speech Recognition of German-English Code-Switching
    Khosravani, Abbas
    Garner, Philip N.
    Lazaridis, Alexandros
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 811 - 816
  • [49] CodeFed: Federated Speech Recognition for Low-Resource Code-Switching Detection
    Madan, Chetan
    Diddee, Harshita
    Kumar, Deepika
    Mittal, Mamta
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2024, 23 (01)
  • [50] A Socio-Pragmatic Analysis of Code-Switching in the Logoli Speech Community of Kangemi
    Gimode, Jescah
    Barnes, Lawrie
    LANGUAGE MATTERS, 2015, 46 (02) : 249 - 274