Effects of Dialectal Code-Switching on Speech Modules: A Study using Egyptian Arabic Broadcast Speech

Cited by: 4
Authors
Chowdhury, Shammur A. [1]
Samih, Younes [1]
Eldesouki, Mohamed [2]
Ali, Ahmed [1]
Affiliations
[1] HBKU, Qatar Comp Res Inst, Doha, Qatar
[2] Concordia Univ, Montreal, PQ, Canada
Keywords
code-switching; dialect identification; corpus; code mixing index
DOI
10.21437/Interspeech.2020-2271
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Subject classification codes
100104; 100213
Abstract
The intra-utterance code-switching (CS) is defined as the alternation between two or more languages within the same utterance. Despite the fact that spoken dialectal code-switching (DCS) is more challenging than CS, it remains largely unexplored. In this study, we describe a method to build the first spoken DCS corpus. The corpus is annotated at the token-level minding both linguistic and acoustic cues for dialectal Arabic. For detailed analysis, we study Arabic automatic speech recognition (ASR), Arabic dialect identification (ADI), and natural language processing (NLP) modules for the DCS corpus. Our results highlight the importance of lexical information for discriminating the DCS labels. We observe that the performance of different models is highly dependent on the degree of code-mixing at the token-level as well as its complexity at the utterance-level.
Pages: 2382 - 2386
Page count: 5
Related papers
50 results in total
  • [21] Code-Switching in Automatic Speech Recognition: The Issues and Future Directions
    Mustafa, Mumtaz Begum
    Yusoof, Mansoor Ali
    Khalaf, Hasan Kahtan
    Abushariah, Ahmad Abdel Rahman Mahmoud
    Kiah, Miss Laiha Mat
    Hua Nong Ting
    Muthaiyah, Saravanan
    APPLIED SCIENCES-BASEL, 2022, 12 (19):
  • [22] Code-switching in multilinguals with dementia: patterns across speech contexts
    Svennevig, Jan
    Hansen, Pernille
    Simonsen, Hanne Gram
    Landmark, Anne Marie Dalby
    CLINICAL LINGUISTICS & PHONETICS, 2019, 33 (10-11) : 1009 - 1030
  • [23] Acoustic modeling for Thai-English code-switching speech
    Chunwijitra, Vataya
    Thatphithakkul, Sumonmas
    Chootrakool, Patcharika
    Kasuriya, Sawit
    PROCEEDINGS OF 2020 23RD CONFERENCE OF THE ORIENTAL COCOSDA INTERNATIONAL COMMITTEE FOR THE CO-ORDINATION AND STANDARDISATION OF SPEECH DATABASES AND ASSESSMENT TECHNIQUES (ORIENTAL-COCOSDA 2020), 2020, : 94 - 99
  • [24] JAPANESE-ENGLISH CODE-SWITCHING SPEECH DATA CONSTRUCTION
    Nakayama, Sahoko
    Kano, Takatomo
    Quoc Truong Do
    Sakti, Sakriani
    Nakamura, Satoshi
    2018 ORIENTAL COCOSDA - INTERNATIONAL CONFERENCE ON SPEECH DATABASE AND ASSESSMENTS, 2018, : 67 - 71
  • [25] Open Domain Continuous Filipino Speech Recognition with Code-Switching
    Ang, Federico
    Miyanaga, Yoshikazu
    Guevara, Rowena Cristina
    Cajote, Rhandley
    Bayona, Michael Gringo Angelo
    2014 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2014, : 2301 - 2304
  • [26] BENCHMARKING EVALUATION METRICS FOR CODE-SWITCHING AUTOMATIC SPEECH RECOGNITION
    Hamed, Injy
    Hussein, Amir
    Chellah, Oumnia
    Chowdhury, Shammur
    Mubarak, Hamdy
    Sitaram, Sunayana
    Habash, Nizar
    Ali, Ahmed
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 999 - 1005
  • [27] Crowdsourcing Universal Part-Of-Speech Tags for Code-Switching
    Sow, Victor
    Hirschberg, Julia
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 77 - 81
  • [28] Language identification by using syllable-based duration classification on code-switching speech
    Lyu, Dau-cheng
    Lyu, Ren-yuan
    Chiang, Yuang-chin
    Hsu, Chun-nan
    CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2006, 4274 : 475 - +
  • [29] Code-Switching Language Modeling with Bilingual Word Embeddings: A Case Study for Egyptian Arabic-English
    Hamed, Injy
    Zhu, Moritz
    Elmahdy, Mohamed
    Abdennadher, Slim
    Vu, Ngoc Thang
    SPEECH AND COMPUTER, SPECOM 2019, 2019, 11658 : 160 - 170
  • [30] Semi-supervised acoustic model training for speech with code-switching
    Yilmaz, Emre
    McLaren, Mitchell
    van den Heuvel, Henk
    van Leeuwen, David A.
    SPEECH COMMUNICATION, 2018, 105 : 12 - 22