A comprehensive survey on automatic speech recognition using neural networks

Cited by: 20
Authors
Dhanjal, Amandeep Singh [1]
Singh, Williamjeet [2]
Affiliations
[1] Punjabi Univ, Dept Comp Sci, Rajpura Rd, Patiala 147001, Punjab, India
[2] Punjabi Univ, Dept Comp Sci & Engn, Rajpura Rd, Patiala 147001, Punjab, India
Keywords
Speech recognition; Dataset; Tools; Neural network; Deep learning; ARABIC SPEECH; SYSTEM; NOISE; HMM; ARCHITECTURES; SEGMENTATION; PRIMER
DOI
10.1007/s11042-023-16438-y
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Automatic Speech Recognition has developed continuously and has demonstrated enormous potential in Human Interaction Communication systems. Achieving high accuracy remains challenging because several factors degrade the performance of a speech recognition system, including different dialects, spontaneous speech, speaker enrolment, computation power, dataset, and noisy environments. These challenges have motivated researchers to make innovative contributions to the development of robust speech recognition systems. This study presents a systematic analysis of state-of-the-art research published in this field during 2015-2021. Its prime focus is to highlight the neural network-based speech recognition techniques, datasets, toolkits, and evaluation metrics used over these seven years. It also synthesizes the evidence from past studies to provide empirical solutions for improving accuracy. The study summarizes the current status of speech recognition systems using neural networks and offers a brief introduction for new researchers.
Pages: 23367 - 23412
Page count: 46
Related papers
50 in total
  • [31] NETWORKS FOR SPEECH ENHANCEMENT AND AUTOMATIC SPEECH RECOGNITION
    Vu, Thanh T.
    Bigot, Benjamin
    Chng, Eng Siong
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 499 - 503
  • [32] Multichannel Signal Processing With Deep Neural Networks for Automatic Speech Recognition
    Sainath, Tara N.
    Weiss, Ron J.
    Wilson, Kevin W.
    Li, Bo
    Narayanan, Arun
    Variani, Ehsan
    Bacchiani, Michiel
    Shafran, Izhak
    Senior, Andrew
    Chin, Kean
    Misra, Ananya
    Kim, Chanwoo
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (05) : 965 - 979
  • [33] Deep Spiking Neural Networks for Large Vocabulary Automatic Speech Recognition
    Wu, Jibin
    Yilmaz, Emre
    Zhang, Malu
    Li, Haizhou
    Tan, Kay Chen
    FRONTIERS IN NEUROSCIENCE, 2020, 14
  • [34] Neural Networks for Proper Name Retrieval in the Framework of Automatic Speech Recognition
    Fohr, Dominique
    Illina, Irina
    2015 6TH INTERNATIONAL CONFERENCE ON INFORMATION SYSTEMS AND ECONOMIC INTELLIGENCE (SIIE), 2015, : 25 - 30
  • [35] An Efficient Noise-Robust Automatic Speech Recognition System using Artificial Neural Networks
    Gupta, Santosh
    Bhurchandi, Kishor M.
    Keskar, Avinash G.
    2016 INTERNATIONAL CONFERENCE ON COMMUNICATION AND SIGNAL PROCESSING (ICCSP), VOL. 1, 2016, : 1873 - 1877
  • [36] Integrated system approach for the automatic speech recognition using linear predict coding and neural networks
    Duran Acevedo, Cristhian Manuel
    Gallo Nieves, Martin
    CERMA 2007: ELECTRONICS, ROBOTICS AND AUTOMOTIVE MECHANICS CONFERENCE, PROCEEDINGS, 2007, : 207 - 212
  • [37] Automatic Speech Emotion Recognition: A Survey
    Chandrasekar, Purnima
    Chapaneri, Santosh
    Jayaswal, Deepak
    2014 INTERNATIONAL CONFERENCE ON CIRCUITS, SYSTEMS, COMMUNICATION AND INFORMATION TECHNOLOGY APPLICATIONS (CSCITA), 2014, : 341 - 346
  • [38] SVMs for Automatic Speech Recognition: A survey
    Solera-Urena, R.
    Padrell-Sendra, J.
    Martin-Iglesias, D.
    Gallardo-Antolin, A.
    Pelaez-Moreno, C.
    Diaz-de-Maria, F.
    PROGRESS IN NONLINEAR SPEECH PROCESSING, 2007, 4391 : 190 - +
  • [39] Speech Emotion Recognition: A Comprehensive Survey
    Mohammed Jawad Al-Dujaili
    Abbas Ebrahimi-Moghadam
    Wireless Personal Communications, 2023, 129 : 2525 - 2561
  • [40] Speech Emotion Recognition: A Comprehensive Survey
    Al-Dujaili, Mohammed Jawad
    Ebrahimi-Moghadam, Abbas
    WIRELESS PERSONAL COMMUNICATIONS, 2023, 129 (04) : 2525 - 2561