ASRoIL: a comprehensive survey for automatic speech recognition of Indian languages

Cited by: 27
Authors
Singh, Amitoj [1]
Kadyan, Virender [2]
Kumar, Munish [1]
Bassan, Nancy [3]
Affiliations
[1] Maharaja Ranjit Singh Punjab Tech Univ, Dept Computat Sci, Bathinda, Punjab, India
[2] Chitkara Univ, Inst Engn & Technol, Dept Comp Sci & Engn, Rajpura, Punjab, India
[3] Baba Farid Coll Engn & Technol, Dept Mech Engn, Bathinda, Punjab, India
Keywords
Automatic speech recognition; Indian languages; Feature extraction techniques; Classification techniques; Speech corpus; EMOTION RECOGNITION; SPEAKER VERIFICATION; WORD RECOGNITION; SYSTEM; FEATURES; CLASSIFICATION; MODEL; PERFORMANCE; ACCURACY; DATABASE;
DOI
10.1007/s10462-019-09775-8
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
India is a land of linguistic diversity, with 22 major languages and more than 720 dialects written in 13 different scripts. Of these 22, Hindi, Bengali, and Punjabi rank as the 3rd, 7th, and 10th most spoken languages in the world, respectively. Except for Hindi, where some significant research is under way, the other two major languages and the remaining Indian languages do not yet have fully developed automatic speech recognition systems. The main aim of this paper is to provide a systematic survey of the existing literature on automatic speech recognition (i.e., speech to text) for Indian languages. The survey analyses the possible opportunities, challenges, techniques, and methods, and it locates, appraises, and synthesizes the evidence from studies to provide empirical answers to the scientific questions. The survey is based on relevant research articles published from 2000 to 2018. Its purpose is to sum up the best available research on automatic speech recognition of Indian languages by synthesizing the results of several studies.
Pages: 3673-3704
Page count: 32
Related papers
50 in total
  • [1] ASRoIL: a comprehensive survey for automatic speech recognition of Indian languages
    Amitoj Singh
    Virender Kadyan
    Munish Kumar
    Nancy Bassan
    Artificial Intelligence Review, 2020, 53 : 3673 - 3704
  • [2] WTASR: Wavelet Transformer for Automatic Speech Recognition of Indian Languages
    Choudhary, Tripti
    Goyal, Vishal
    Bansal, Atul
    BIG DATA MINING AND ANALYTICS, 2023, 6 (01) : 85 - 91
  • [3] Automatic speech recognition for under-resourced languages: A survey
    Besacier, Laurent
    Barnard, Etienne
    Karpov, Alexey
    Schultz, Tanja
    SPEECH COMMUNICATION, 2014, 56 : 85 - 100
  • [4] A comprehensive survey on automatic speech recognition using neural networks
    Amandeep Singh Dhanjal
    Williamjeet Singh
    Multimedia Tools and Applications, 2024, 83 : 23367 - 23412
  • [5] A comprehensive survey on automatic speech recognition using neural networks
    Dhanjal, Amandeep Singh
    Singh, Williamjeet
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (8) : 23367 - 23412
  • [6] Indian Languages Corpus for Speech Recognition
    Basu, Joyanta
    Khan, Soma
    Roy, Rajib
    Saxena, Babita
    Ganguly, Dipankar
    Arora, Sunita
    Arora, Karunesh Kumar
    Bansal, Shweta
    Agrawal, Shyam Sunder
    2019 22ND CONFERENCE OF THE ORIENTAL COCOSDA INTERNATIONAL COMMITTEE FOR THE CO-ORDINATION AND STANDARDISATION OF SPEECH DATABASES AND ASSESSMENT TECHNIQUES (O-COCOSDA), 2019, : 13 - 18
  • [7] Automatic Speech Recognition System for Tonal Languages: State-of-the-Art Survey
    Kaur, Jaspreet
    Singh, Amitoj
    Kadyan, Virender
    ARCHIVES OF COMPUTATIONAL METHODS IN ENGINEERING, 2021, 28 (03) : 1039 - 1068
  • [8] Automatic Speech Recognition System for Tonal Languages: State-of-the-Art Survey
    Jaspreet Kaur
    Amitoj Singh
    Virender Kadyan
    Archives of Computational Methods in Engineering, 2021, 28 : 1039 - 1068
  • [9] Automatic speech recognition: a survey
    Mishaim Malik
    Muhammad Kamran Malik
    Khawar Mehmood
    Imran Makhdoom
    Multimedia Tools and Applications, 2021, 80 : 9411 - 9457
  • [10] A survey on automatic speech recognition
    Nakagawa, Seiichi
    IEICE Transactions on Information and Systems, 2002, E85-D (03) : 465 - 486