Bangladeshi Bangla speech corpus for automatic speech recognition research

Cited by: 7

Authors:

Kibria, Shafkat [1]

Samin, Ahnaf Mozib [1]

Kobir, M. Humayon [1]

Rahman, M. Shahidur [1]

Selim, M. Reza [1]

Iqbal, M. Zafar [1]

Affiliation:

[1] Shahjalal Univ Sci & Technol, Dept Comp Sci & Engn, Sylhet 3114, Bangladesh

Source:

SPEECH COMMUNICATION | 2022 / Vol. 136

Keywords:

Bangladeshi Bangla corpus; Automatic speech recognition; Corpora evaluation; Recurrent neural network

DOI:

10.1016/j.specom.2021.12.004

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

This article reports the development of a language resource for Bangladeshi Bangla spoken language (BBSL). Bangladeshi Bangla lacks adequately large speech corpora for Large Vocabulary Continuous Speech Recognition (LVCSR) systems. The accuracy of an automatic speech recognition (ASR) system depends on the quality of the speech corpus. This work discusses the common issues and activities involved in developing a large speech corpus named (sic) (SUBAK.KO). The corpus is designed to support ASR research in Bangladeshi Bangla and is labeled sentence-wise. We trained a model on this corpus using one of the well-known current end-to-end ASR algorithms, Recurrent Neural Networks (RNNs) with Connectionist Temporal Classification (CTC). To identify its strengths and weaknesses, we examined the Character Error Rate (CER) and Word Error Rate (WER) of the trained RNN-CTC model. A second model was trained on another open-source large Bangla ASR corpus using the same algorithm, and the two trained models were compared to assess the quality of the corpora. SUBAK.KO was found to be the more balanced corpus and to cover more regional accent variability for an LVCSR system than the open-source corpus.
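
The abstract evaluates the trained model by CER and WER. As a rough illustration of how these two metrics are commonly defined (edit distance between reference and hypothesis, divided by the reference length), a minimal Python sketch follows. It is not the authors' evaluation code: the function names `edit_distance`, `wer`, and `cer` are hypothetical, the example strings are toy English text rather than corpus data, and counting spaces as characters in CER is an assumption that real toolkits may handle differently.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the first i-1 reference tokens
    # and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from i reference tokens to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character-level edit distance / reference length.

    Assumption: every character, including spaces, counts as a token.
    """
    if not reference:
        raise ValueError("reference must contain at least one character")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    # Toy strings for illustration only; they are not from any corpus.
    ref = "the cat sat on the mat"
    hyp = "the cat sit on mat"
    print(f"WER = {wer(ref, hyp):.3f}")
    print(f"CER = {cer(ref, hyp):.3f}")
```

Because both metrics are normalized by the reference length, either one can exceed 1.0 when the hypothesis contains many insertions.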

Pages: 84-97

Page count: 14