Comprehensive literature review on children automatic speech recognition system, acoustic linguistic mismatch approaches and challenges

Cited by: 0
Authors
Sobti, Rajni [1, 2]
Guleria, Kalpna [1 ]
Kadyan, Virender [3 ]
Affiliations
[1] Chitkara Univ, Chitkara Univ Inst Engn & Technol, Chandigarh, Punjab, India
[2] Panjab Univ, Univ Inst Engn & Technol, Chandigarh, India
[3] Univ Petr & Energy Studies, Speech & Language Res Ctr, Sch Comp Sci, Dehra Dun, Uttarakhand, India
Keywords
Automatic Speech Recognition; Applications of Child ASRs; Data Augmentation; Acoustic and Linguistic Variations; Mismatch ASR; Low Resource Language; Variability; Adaptation; Domain; ASR
DOI
10.1007/s11042-024-18753-4
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Automatic Speech Recognition (ASR) systems are as important for children as for adults, since children increasingly depend on applications such as computer games, reading tutors, and foreign language learning tools. This article therefore presents a comprehensive review of several important aspects of children's speech recognition systems. The acoustic and linguistic challenges of children's speech are presented thoroughly to help understand the basic anatomy of children's articulation organs. The development of children's ASR faces a variety of challenges: collecting children's speech data is a very complex task; the available child corpora are not publicly accessible; child speakers differ greatly due to linguistic and acoustic variations; and ASRs developed for one age group are not suitable for another. All these challenges are systematically described in this article. Various data augmentation methods are also explored, along with different approaches to developing ASR for children's speech. It has been observed that the lack of publicly accessible child corpora is a significant barrier to children's ASR. Beyond these challenges, an attempt has been made to thoroughly review children's ASR for the Punjabi language, which is ranked the 10th most spoken language globally yet is still considered a low-resource language. Further, various approaches to developing children's ASR, such as traditional, hybrid, and end-to-end (E2E) networks, are also reported. In addition, an analytical summary and discussion are included.
Pages: 81933-81995
Page count: 63