Exploring the Effect of Dialect Mismatched Language Models in Telugu Automatic Speech Recognition

Cited by: 0
Authors
Yadavalli, Aditya [1]
Mirishkar, Ganesh S. [1]
Vuppala, Anil Kumar [1]
Affiliation
[1] Int Inst Informat Technol, Speech Proc Lab, Hyderabad 500032, Telangana, India
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Previous research has found that the Acoustic Models (AMs) of an Automatic Speech Recognition (ASR) system are susceptible to dialect variations within a language, which degrades ASR performance. To counter this, researchers have proposed building a dialect-specific AM while keeping the Language Model (LM) constant across all dialects. This study explores the effect of a dialect-mismatched LM by considering three Telugu regional dialects: Telangana, Coastal Andhra, and Rayalaseema. We show that dialect variations that surface as differences in lexicon, grammar, and occasionally semantics can significantly degrade the performance of the LM under mismatched conditions. This degradation in turn has an adverse effect on the ASR system even when a dialect-specific AM is used. We show a degradation of up to 13.13 perplexity points when the LM is used under mismatched conditions. Furthermore, we show a degradation of over 9% in Character Error Rate (CER) and over 15% in Word Error Rate (WER) in the ASR systems when using mismatched LMs instead of matched LMs.
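The abstract reports results in CER and WER without defining them. As a rough illustration only, the sketch below computes both in the conventional way, as edit distance divided by reference length. It is not the authors' evaluation code: the function names are hypothetical, characters are assumed to be Unicode code points (not grapheme clusters, which may matter for Telugu script), and spaces are assumed to count as characters in CER.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance / reference length.

    Assumption: characters are Unicode code points and spaces are counted.
    """
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

For example, `wer("a b c d", "a x c d")` gives 0.25, since one of four reference words is substituted.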
Pages: 292-301
Page count: 10