Indian Languages Corpus for Speech Recognition

Authors
Basu, Joyanta [1]
Khan, Soma [1]
Roy, Rajib [1]
Saxena, Babita [1]
Ganguly, Dipankar [1]
Arora, Sunita [1]
Arora, Karunesh Kumar [1]
Bansal, Shweta [2]
Agrawal, Shyam Sunder [2]
Affiliations
[1] CDAC, Kolkata, India
[2] KIIT Coll Engn Gurgaon, Gurgaon, India
Keywords
Speech Corpus; IVR; Transcription
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Robust speech recognition systems for various languages have moved beyond research labs into commercial products. This has been possible owing to major developments in machine learning, especially deep learning. However, advanced speech recognition systems can be developed only when specially curated speech data is available. Systems of usable quality are yet to be developed for most Indian languages. The present paper describes the design and development of a standard speech corpus that can be used for developing general-purpose automatic speech recognition (ASR) systems and benchmarking them. The database has been developed for the Indian languages Hindi, Bengali and Indian English. The corpus design incorporates important parameters such as phonetic coverage and distribution. The data was recorded from 1500 speakers in each language, male and female, of different age groups and in varying environments. The recordings were captured on a server using an online recording system and transcribed using semi-automatic tools. The paper describes the corpus design methodology, the challenges faced and the approach adopted to overcome them. The overall process of designing the speech database is generic enough to be used for other languages as well.
Pages: 13-18
Number of pages: 6