Dual Script E2E Framework for Multilingual and Code-Switching ASR

Cited by: 1
Authors
Kumar, Mari Ganesh [1]
Kuriakose, Jom [1]
Thyagachandran, Anand [1]
Kumar, Arun A. [1]
Seth, Ashish [1]
Prasad, Lodagala V. S. V. Durga [1]
Jaiswal, Saish [1]
Prakash, Anusha [1]
Murthy, Hema A. [1]
Affiliation
[1] Indian Inst Technol Madras, Chennai, Tamil Nadu, India
Source
Interspeech 2021
Keywords
speech recognition; low-resource; multilingual; common label set; dual script
DOI
10.21437/Interspeech.2021-978
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Subject classification codes
100104; 100213
Abstract
India is home to multiple languages, and training automatic speech recognition (ASR) systems for them is challenging. Over time, each language has adopted words from other languages, such as English, leading to code-mixing. Most Indian languages also have their own unique scripts, which poses a major limitation in training multilingual and code-switching ASR systems. Inspired by results in text-to-speech synthesis, in this paper we use an in-house rule-based phoneme-level common label set (CLS) representation to train multilingual and code-switching ASR for Indian languages. We propose two end-to-end (E2E) ASR systems. In the first system, the E2E model is trained on the CLS representation, and we use a novel data-driven back-end to recover the native language script. In the second system, we propose a modification to the E2E model, wherein the CLS representation and the native language characters are used simultaneously for training. We show our results on the multilingual and code-switching (MUCS) ASR challenge 2021. Our best results achieve approximately 6% and 5% improvement in word error rate over the baseline system for the multilingual and code-switching tasks, respectively, on the challenge development data.
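The idea of a shared label set across scripts can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the four-symbol table `TOY_CLS`, the language codes, and the helper names `to_cls` and `from_cls` are hypothetical and are not the in-house CLS described in the abstract. The table also ignores inherent vowels, and the inverse lookup is only a stand-in for the data-driven back-end, whose details the abstract does not give.

```python
# Toy illustration only: a hypothetical four-symbol mapping, NOT the paper's CLS.
# Inherent vowels are ignored to keep the example short.
TOY_CLS = {
    "hi": {"क": "k", "म": "m", "ल": "l", "न": "n"},  # Devanagari consonants
    "ta": {"க": "k", "ம": "m", "ல": "l", "ந": "n"},  # Tamil consonants
}


def to_cls(text, lang):
    """Map native-script characters to shared labels; unknown characters pass through."""
    table = TOY_CLS[lang]
    return [table.get(ch, ch) for ch in text]


def from_cls(labels, lang):
    """Recover native script by inverting the toy table for one language.

    This plain lookup is an assumed stand-in for the data-driven back-end.
    """
    inverse = {label: ch for ch, label in TOY_CLS[lang].items()}
    return "".join(inverse.get(label, label) for label in labels)


hindi_labels = to_cls("कमल", "hi")
tamil_labels = to_cls("கமல", "ta")
print(hindi_labels == tamil_labels)   # both scripts share one label sequence
print(from_cls(hindi_labels, "ta"))   # render the shared labels in Tamil script
```

In this toy setting, text in both scripts collapses to the same label sequence, which is what allows one model to be trained on pooled multilingual data. The target script then has to be chosen when converting labels back.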
Pages: 2441-2445
Page count: 5