Bangladeshi Bangla speech corpus for automatic speech recognition research

Cited by: 7
Authors
Kibria, Shafkat [1]
Samin, Ahnaf Mozib [1]
Kobir, M. Humayon [1]
Rahman, M. Shahidur [1]
Selim, M. Reza [1]
Iqbal, M. Zafar [1]
Affiliations
[1] Shahjalal Univ Sci & Technol, Dept Comp Sci & Engn, Sylhet 3114, Bangladesh
Keywords
Bangladeshi Bangla corpus; Automatic speech recognition; Corpora evaluation; Recurrent neural network
DOI
10.1016/j.specom.2021.12.004
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This article reports the development of a language resource for Bangladeshi Bangla spoken language (BBSL). Bangladeshi Bangla lacks adequate large speech corpora for Large Vocabulary Continuous Speech Recognition (LVCSR) systems. The accuracy of an automatic speech recognition (ASR) system rests on the quality of the speech corpus. This work discusses the common issues and activities involved in developing a large speech corpus named SUBAK.KO. The corpus is designed to support ASR research in Bangladeshi Bangla and is labeled sentence-wise. We trained a model on this corpus with one of the well-known current end-to-end ASR approaches, Recurrent Neural Networks (RNNs) with Connectionist Temporal Classification (CTC). To identify its strengths and weaknesses, we observed the Character Error Rate (CER) and the Word Error Rate (WER) of the trained RNN-CTC model. A model was also trained on another open-source large Bangla ASR corpus using the same ASR algorithm, and the two trained models were compared to assess the quality of the corpora. We found that SUBAK.KO is the more balanced corpus and accounts for more regional-accent speech variability for an LVCSR system than the other open-source large Bangla ASR corpus.
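The abstract relies on CER and WER to compare the trained models. As a minimal sketch of how these metrics are commonly computed, the Python below divides the Levenshtein edit distance between a reference transcript and a hypothesis by the reference length. The function names (`edit_distance`, `wer`, `cer`) are hypothetical and are not taken from the paper. The sketch also assumes whitespace tokenization for words, and it counts Unicode code points (spaces included) as characters, which may differ from the authors' handling of Bangla script.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Insertions, deletions, and substitutions each cost 1.
    """
    # prev[j] holds the distance between the first i-1 items of ref
    # and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()  # assumption: whitespace tokenization
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance / reference length."""
    # Assumption: each Unicode code point, including spaces, is one character.
    if not reference:
        raise ValueError("reference must contain at least one character")
    return edit_distance(reference, hypothesis) / len(reference)
```

Both rates can exceed 1.0 when the hypothesis contains many insertions, because the denominator is the reference length only.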
Pages: 84-97
Number of pages: 14