Efficiency of End-to-End Speech Recognition for Languages with Scarce Resources

Cited by: 0
Authors
Rudzionis, Vytautas [1]
Malukas, Ugnius [2]
Lopata, Audrius [2]
Affiliations
[1] Vilnius Univ, Kaunas Fac, Muitines 8, Kaunas, Lithuania
[2] Kaunas Univ Technol, Studentu 50, Kaunas, Lithuania
Keywords
Speech recognition; Hybrid methods; Machine learning; End-to-end speech recognition
DOI
10.1007/978-3-031-16302-9_20
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Modern deep-learning-based speech recognition methods achieve very high recognition accuracy, but training such systems requires enormous amounts of data. Many less widely spoken languages simply do not have speech corpora of the necessary size. The paper evaluates the efficiency of DeepSpeech-based speech recognition when only limited training data are available and examines ways to improve its accuracy. The experiments showed that the accuracy of the DeepSpeech2 recognizer trained on about 100 h of speech is quite modest, but applying simple grammatical constraints reduced the word error rate to 23-25%.
Pages: 259 - 264
Page count: 6
Related papers
50 in total
  • [1] End-to-End Speech Recognition in Agglutinative Languages
    Mamyrbayev, Orken
    Alimhan, Keylan
    Zhumazhanov, Bagashar
    Turdalykyzy, Tolganay
    Gusmanova, Farida
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS (ACIIDS 2020), PT II, 2020, 12034 : 391 - 401
  • [2] End-to-end Speech Recognition for Languages with Ideographic Characters
    Ito, Hitoshi
    Hagiwara, Aiko
    Ichiki, Manon
    Mishima, Takeshi
    Sato, Shoei
    Kobayashi, Akio
    2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC 2017), 2017 : 1269 - 1273
  • [3] Integrated End-to-End Automatic Speech Recognition for Languages for Agglutinative Languages
    Bekarystankyzy, Akbayan
    Mamyrbayev, Orken
    Anarbekova, Tolganay
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2024, 23 (06)
  • [4] END-TO-END SPEECH RECOGNITION AND KEYWORD SEARCH ON LOW-RESOURCE LANGUAGES
    Rosenberg, Andrew
    Audhkhasi, Kartik
    Sethy, Abhinav
    Ramabhadran, Bhuvana
    Picheny, Michael
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017 : 5280 - 5284
  • [5] End-to-End Speech Recognition in Russian
    Markovnikov, Nikita
    Kipyatkova, Irina
    Lyakso, Elena
    SPEECH AND COMPUTER (SPECOM 2018), 2018, 11096 : 377 - 386
  • [6] END-TO-END MULTIMODAL SPEECH RECOGNITION
    Palaskar, Shruti
    Sanabria, Ramon
    Metze, Florian
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018 : 5774 - 5778
  • [7] Overview of end-to-end speech recognition
    Wang, Song
    Li, Guanyu
    2018 INTERNATIONAL SYMPOSIUM ON POWER ELECTRONICS AND CONTROL ENGINEERING (ISPECE 2018), 2019, 1187
  • [8] Multichannel End-to-end Speech Recognition
    Ochiai, Tsubasa
    Watanabe, Shinji
    Hori, Takaaki
    Hershey, John R.
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [9] End-to-end Accented Speech Recognition
    Viglino, Thibault
    Motlicek, Petr
    Cernak, Milos
    INTERSPEECH 2019, 2019 : 2140 - 2144
  • [10] END-TO-END AUDIOVISUAL SPEECH RECOGNITION
    Petridis, Stavros
    Stafylakis, Themos
    Ma, Pingchuan
    Cai, Feipeng
    Tzimiropoulos, Georgios
    Pantic, Maja
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018 : 6548 - 6552