Enhancing Large Vocabulary Continuous Speech Recognition System for Urdu-English Conversational Code-Switched Speech

Cited by: 0
Authors
Farooq, Muhammad Umar [1]
Adeeba, Farah [1]
Hussain, Sarmad [1]
Rauf, Sahar [1]
Khalid, Maryam [1]
Affiliations
[1] Univ Engn & Technol, Ctr Language Engn, Al Khawarizmi Inst Comp Sci, Lahore, Pakistan
Keywords
Urdu-English code-switching; Urdu speech recognition; under-resourced language
DOI
10.1109/o-cocosda50338.2020.9295036
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a first step towards a Large Vocabulary Continuous Speech Recognition (LVCSR) system for Urdu-English code-switched conversational speech. Urdu is the national language and lingua franca of Pakistan, with 100 million speakers worldwide. English, on the other hand, is an official language of Pakistan and is commonly mixed with Urdu in daily communication. Because Urdu is an under-resourced language, no substantial Urdu-English code-switched corpus is available for developing a speech recognition system. In this research, a readily available spontaneous Urdu speech corpus (25 hours) is revised and used to enhance a read-speech Urdu LVCSR system so that it can recognize code-switched speech. This data set is split into a 20-hour training set and a 5-hour test set. A further 10 hours of Urdu broadcast (BC) data are collected and annotated in a semi-supervised way to enhance the system further. For acoustic modeling, a state-of-the-art DNN-HMM modeling technique is used without any prior GMM-HMM training or alignments. Various techniques for improving the language model using monolingual data are investigated. The overall Word Error Rate (WER) on the test set is reduced from 40.71% to 26.95%.
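For readers unfamiliar with the metric, the following is a minimal sketch of how a percent WER is commonly computed (word-level edit distance divided by the number of reference words), together with the relative reduction implied by the two figures in the abstract. The function names `word_error_rate` and `relative_reduction` and the example sentences are hypothetical and not taken from the paper; the authors' actual scoring tool and text normalization are not stated in the abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Percent WER: (substitutions + deletions + insertions) / reference words.

    Assumption: whitespace tokenization and exact string match; a real
    evaluation would normally apply text normalization first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative change between two error rates, in percent of the baseline."""
    return 100.0 * (baseline - improved) / baseline


if __name__ == "__main__":
    # Made-up example: one substitution and one deletion over four words.
    print(word_error_rate("a b c d", "a x c"))
    # WER figures quoted in the abstract.
    print(round(relative_reduction(40.71, 26.95), 2))
```

Under these assumptions, the drop from 40.71% to 26.95% is an absolute reduction of 13.76 points, or roughly a 33.8% relative reduction.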
Pages: 155-159
Page count: 5