Comprehensive literature review on children automatic speech recognition system, acoustic linguistic mismatch approaches and challenges

Cited by: 0

Authors:

Sobti, Rajni [1,2]

Guleria, Kalpna [1]

Kadyan, Virender [3]

Affiliations:

[1] Chitkara Univ, Chitkara Univ Inst Engn & Technol, Chandigarh, Punjab, India

[2] Panjab Univ, Univ Inst Engn & Technol, Chandigarh, India

[3] Univ Petr & Energy Studies, Speech & Language Res Ctr, Sch Comp Sci, Dehra Dun, Uttarakhand, India

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2024, Vol. 83, No. 35

Keywords:

Automatic Speech Recognition; Applications of Child ASRs; Data Augmentation; Acoustic and Linguistic Variations; Mismatch ASR; Low Resource Language; DATA AUGMENTATION; VARIABILITY; ADAPTATION; DOMAIN; ASR;

DOI:

10.1007/s11042-024-18753-4

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812 ;

Abstract:

Automatic speech recognition (ASR) systems are as important for children as for adults, since children increasingly depend on applications such as computer games, reading tutors, and foreign language learning tools. This article therefore presents a comprehensive review of several important aspects of children's speech recognition systems. The acoustic and linguistic challenges of children's speech are presented thoroughly to help the reader understand the basic anatomy of children's articulatory organs. Developing children's ASR faces a variety of challenges: collecting children's speech data is a very complex task; the available child corpora are not publicly accessible; child speakers differ greatly because of linguistic and acoustic variations; and ASRs developed for one age group are not suitable for another. All these challenges are systematically described in this article. Various data augmentation methods are also explored, along with different approaches to developing ASR for children's speech. It has been observed that the lack of publicly accessible child corpora is a significant barrier to children's ASR. In addition to these challenges, the article thoroughly reviews children's ASR for the Punjabi language, which is ranked the 10th most spoken language globally yet is still considered a low-resource language. Further, various approaches to the development of children's ASR, such as traditional, hybrid, and end-to-end (E2E) networks, are reported. An analytical summary and discussion are also included.


Pages: 81933 - 81995

Page count: 63

50 records in total

[1] Adaptive feature truncation to address acoustic mismatch in automatic recognition of children's speech
Ghai, Shweta
Sinha, Rohit
APSIPA TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING, 2016, 5 (05)
[2] Automatic Speech Recognition (ASR) Systems for Children: A Systematic Literature Review
Bhardwaj, Vivek
Ben Othman, Mohamed Tahar
Kukreja, Vinay
Belkhier, Youcef
Bajaj, Mohit
Goud, B. Srikanth
Rehman, Ateeq Ur
Shafiq, Muhammad
Hamam, Habib
APPLIED SCIENCES-BASEL, 2022, 12 (09):
[3] Automatic Speech Recognition System for Malay Speaking Children
Rahman, Feisal Dani
Mohamed, Noraini
Mustafa, Mumtaz Begum
Salim, Siti Salwah
2014 THIRD ICT INTERNATIONAL STUDENT PROJECT CONFERENCE (ICT-ISPC), 2014, : 79 - 82
[4] A review on automatic speech recognition architecture and approaches
Karpagavalli, S. (karpagavalli@psgrkc.com), 1600, Science and Engineering Research Support Society (09):
[5] Automatic Speech Recognition: Systematic Literature Review
Alharbi, Sadeen
Alrazgan, Muna
Alrashed, Alanoud
Alnomasi, Turkiayh
Almojel, Raghad
Alharbi, Rimah
Alharbi, Saja
Alturki, Sahar
Alshehri, Fatimah
Almojil, Maha
IEEE ACCESS, 2021, 9 : 131858 - 131876
[6] Acoustic variability and automatic recognition of children's speech
Gerosa, Matteo
Giuliani, Diego
Brugnara, Fabio
SPEECH COMMUNICATION, 2007, 49 (10-11) : 847 - 860
[7] A review of the acoustic and linguistic properties of children's speech
Potamianos, Alexandros
Narayanan, Shrikanth
2007 IEEE NINTH WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, 2007, : 22 - 25
[8] Automatic Speech Emotion Recognition: a Systematic Literature Review
Mustafa H.H.
Darwish N.R.
Hefny H.A.
International Journal of Speech Technology, 2024, 27 (1) : 267 - 285
[9] Arabic Automatic Speech Recognition: A Systematic Literature Review
Dhouib, Amira
Othman, Achraf
El Ghoul, Oussama
Khribi, Mohamed Koutheair
Al Sinani, Aisha
APPLIED SCIENCES-BASEL, 2022, 12 (17):
[10] Acoustic Analysis and Automatic Recognition of Spontaneous Children's Speech
Gerosa, M.
Giuliani, D.
Narayanan, S.
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1886 - +
