Enhancing Large Vocabulary Continuous Speech Recognition System for Urdu-English Conversational Code-Switched Speech

Cited by: 0
Authors
Farooq, Muhammad Umar [1 ]
Adeeba, Farah [1 ]
Hussain, Sarmad [1 ]
Rauf, Sahar [1 ]
Khalid, Maryam [1 ]
Affiliations
[1] Univ Engn & Technol, Ctr Language Engn, Al Khawarizmi Inst Comp Sci, Lahore, Pakistan
Keywords
Urdu-English code-switching; Urdu speech recognition; under-resourced language;
DOI
10.1109/o-cocosda50338.2020.9295036
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a first step towards a Large Vocabulary Continuous Speech Recognition (LVCSR) system for Urdu-English code-switched conversational speech. Urdu is the national language and lingua franca of Pakistan, with 100 million speakers worldwide. English, on the other hand, is an official language of Pakistan and is commonly mixed with Urdu in daily communication. Because Urdu is an under-resourced language, no substantial Urdu-English code-switched corpus is available for developing a speech recognition system. In this research, a readily available spontaneous Urdu speech corpus (25 hours) is revised and used to enhance a read-speech Urdu LVCSR system so that it can recognize code-switched speech. This data set is split into a 20-hour training set and a 5-hour test set. A further 10 hours of Urdu broadcast (BC) data are collected and annotated in a semi-supervised way to enhance the system further. For acoustic modeling, a state-of-the-art DNN-HMM modeling technique is used without any prior GMM-HMM training or alignments. Various techniques for improving the language model using monolingual data are investigated. The overall Word Error Rate (WER) is reduced from 40.71% to 26.95% on the test set.
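For readers unfamiliar with the metric, the sketch below shows one common way to compute percent WER (word-level edit distance divided by the number of reference words) and the relative reduction implied by the two figures quoted in the abstract. It is an illustration only, not the authors' scoring tool: the function names `word_error_rate` and `relative_reduction` are hypothetical, and the romanized sentence pair in the usage note is invented rather than taken from the corpus.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Percent WER: word-level Levenshtein distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference word
                            curr[j - 1] + 1,      # insertion of a hypothesis word
                            prev[j - 1] + cost))  # substitution or exact match
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


if __name__ == "__main__":
    # Invented code-switched example: one reference word is dropped.
    print(word_error_rate("mujhe meeting attend karni hai",
                          "mujhe meeting karni hai"))      # 20.0
    # Figures quoted in the abstract: 40.71% -> 26.95% WER.
    print(f"{relative_reduction(40.71, 26.95):.2f}")       # 33.80
```

Under this definition, the reported drop from 40.71% to 26.95% is an absolute reduction of 13.76 percentage points, or roughly a 33.8% relative reduction.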
Pages: 155-159
Page count: 5
Related papers
50 in total
  • [1] The 2001 BYBLOS english large vocabulary conversational speech recognition system
    Matsoukas, S
    Colthurst, T
    Kimball, O
    Solomonoff, A
    Richardson, F
    Quillen, C
    Gish, H
    Dognin, P
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 721 - 724
  • [2] Phone Merging for Code-switched Speech Recognition
    Sivasankaran, Sunit
    Srivastava, Brij Mohan Lal
    Sitaram, Sunayana
    Bali, Kalika
    Choudhury, Monojit
    COMPUTATIONAL APPROACHES TO LINGUISTIC CODE-SWITCHING, 2018, : 11 - 19
  • [3] TRANSFORMER-TRANSDUCERS FOR CODE-SWITCHED SPEECH RECOGNITION
    Dalmia, Siddharth
    Liu, Yuzong
    Ronanki, Srikanth
    Kirchhoff, Katrin
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 5859 - 5863
  • [4] Homophone Identification and Merging for Code-switched Speech Recognition
    Srivastava, Brij Mohan Lal
    Sitaram, Sunayana
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 1943 - 1947
  • [5] Two sepedi-english code-switched speech corpora
    Modipa, Thipe I.
    Davel, Marelie H.
    LANGUAGE RESOURCES AND EVALUATION, 2022, 56 (03) : 703 - 727
  • [6] Automatic Speech Recognition of English-isiZulu Code-switched Speech from South African Soap Operas
    van der Westhuizen, Ewald
    Niesler, Thomas
    SLTU-2016 5TH WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGIES FOR UNDER-RESOURCED LANGUAGES, 2016, 81 : 121 - 127
  • [7] Two sepedi-english code-switched speech corpora
    Thipe I. Modipa
    Marelie H. Davel
    Language Resources and Evaluation, 2022, 56 : 703 - 727
  • [8] Meta-Transfer Learning for Code-Switched Speech Recognition
    Winata, Genta Indra
    Cahyawijaya, Samuel
    Lin, Zhaojiang
    Liu, Zihan
    Xu, Peng
    Fung, Pascale
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 3770 - 3776
  • [9] The RWTH large vocabulary continuous speech recognition system
    Ney, H
    Welling, L
    Ortmanns, S
    Beulen, K
    Wessel, F
    PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998, : 853 - 856
  • [10] A Myanmar Large Vocabulary Continuous Speech Recognition System
    Naing, Hay Mar Soe
    Hlaing, Aye Mya
    Pa, Win Pa
    Hu, Xinhui
    Thu, Ye Kyaw
    Hori, Chiori
    Kawai, Hisashi
    2015 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2015, : 320 - 327