Efficiency of End-to-End Speech Recognition for Languages with Scarce Resources

Cited by: 0
Authors
Rudzionis, Vytautas [1]
Malukas, Ugnius [2]
Lopata, Audrius [2]
Affiliations
[1] Vilnius Univ, Kaunas Fac, Muitines 8, Kaunas, Lithuania
[2] Kaunas Univ Technol, Studentu 50, Kaunas, Lithuania
Keywords
Speech recognition; Hybrid methods; Machine learning; End-to-end speech recognition
DOI
10.1007/978-3-031-16302-9_20
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Modern deep-learning-based speech recognition methods can achieve very high recognition accuracy, but training such systems requires enormous amounts of data. Many less widely spoken languages simply do not have speech corpora of the necessary size. The paper evaluates the efficiency of DeepSpeech-based speech recognition when only limited training data are available and examines ways to improve accuracy. The experiments showed that a DeepSpeech2 recognizer trained on about 100 h of speech achieves quite modest accuracy, but applying simple grammatical constraints reduced the word error rate to 23-25%.
Pages: 259-264
Page count: 6