Indian Languages Corpus for Speech Recognition

Cited by: 0

Authors:

Basu, Joyanta [1]

Khan, Soma [1]

Roy, Rajib [1]

Saxena, Babita [1]

Ganguly, Dipankar [1]

Arora, Sunita [1]

Arora, Karunesh Kumar [1]

Bansal, Shweta [2]

Agrawal, Shyam Sunder [2]

Affiliations:

[1] CDAC, Kolkata, India

[2] KIIT Coll Engn Gurgaon, Gurgaon, India

Source:

2019 22ND CONFERENCE OF THE ORIENTAL COCOSDA INTERNATIONAL COMMITTEE FOR THE CO-ORDINATION AND STANDARDISATION OF SPEECH DATABASES AND ASSESSMENT TECHNIQUES (O-COCOSDA) | 2019

Keywords:

Speech Corpus; IVR; Transcription;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Robust speech recognition systems for various languages have moved beyond research labs into commercial products. This has been possible owing to major developments in machine learning, especially deep learning. However, advanced speech recognition systems can be developed only when specially curated speech data is available. Systems of usable quality are yet to be developed for most Indian languages. The present paper describes the design and development of a standard speech corpus that can be used for developing general-purpose ASR systems and benchmarking them. The database has been developed for three Indian languages, namely Hindi, Bengali and Indian English. The corpus design incorporates important parameters such as phonetic coverage and distribution. The data was recorded from 1500 speakers in each language, male and female, of different age groups and in varying environments. The data was recorded on a server using an online recording system and transcribed using semi-automatic tools. The paper describes the corpus design methodology, the challenges faced and the approach adopted to overcome them. The whole process of designing the speech database is generic enough to be used for other languages as well.


Pages: 13 - 18

Page count: 6
