Enhancing the Recognition of Children's Speech on Acoustically Mismatched ASR System

Cited by: 0

Authors:

Shahnawazuddin, S. [1]

Kathania, Hemant Kumar [2]

Sinha, Rohit [1]

Affiliations:

[1] Indian Inst Technol Guwahati, Dept Elect & Elect Engn, Gauhati 781039, India

[2] Natl Inst Technol Sikkim, Dept Elect & Commun Engn, Sikkim 737139, India

Source:

TENCON 2015 - 2015 IEEE REGION 10 CONFERENCE | 2015

Keywords:

Children ASR; feature projection; adaptation

DOI:

Not available

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification codes:

0808; 0809

Abstract:

The work presented in this paper explores the problem of recognizing children's speech using acoustic models trained on adults' speech data. In such conditions, the large acoustic mismatch between training and test data causes a severe degradation in recognition performance. In our earlier work, a binary weighting of cepstral features as well as of acoustic model parameters was explored to address this mismatch. In this paper, a soft-weighting (SW) scheme is proposed to overcome the information loss incurred by the simple binary weighting scheme. This is achieved through a low-rank projection learned using adults' training data. The transform so derived emphasizes the principal dimensions of acoustic variation in adults' speech. During testing, the transform maps children's test data to the space of the training data and thus suppresses the mismatched dimensions. The proposed scheme is verified experimentally using a recognition system trained on adults' data only, as well as another system trained on adults' and children's data pooled together. The effectiveness of acoustic model adaptation is also explored to further enhance system performance. Combining SW with cluster model interpolation leads to a relative improvement of 14% over the baseline.
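The low-rank projection idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the transform is learned, so this example assumes a plain PCA-style projector estimated from adult feature vectors; the function names, the rank value, the 13-dimensional features, and the synthetic data are all hypothetical stand-ins and not the authors' method or settings.

```python
import numpy as np


def learn_low_rank_projection(adult_feats, rank):
    """Estimate a rank-`rank` projector from adult features (frames x dims).

    Assumption: PCA is used here only as a stand-in for the paper's
    (unspecified) way of learning the low-rank transform.
    """
    mean = adult_feats.mean(axis=0)
    centered = adult_feats - mean
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:rank].T           # dims x rank, orthonormal columns
    projector = basis @ basis.T   # dims x dims matrix of rank `rank`
    return mean, projector


def project(feats, mean, projector):
    """Map features onto the subspace spanned by the adult principal directions."""
    return (feats - mean) @ projector + mean


# Synthetic stand-ins for cepstral features (hypothetical data, 13 dimensions).
rng = np.random.default_rng(0)
adult = rng.normal(size=(500, 13)) * np.linspace(3.0, 0.1, 13)
child = rng.normal(size=(50, 13))

mean, P = learn_low_rank_projection(adult, rank=6)
child_proj = project(child, mean, P)
```

Because the projector keeps only the dominant directions of adult variation, components of the children's features outside that subspace are removed, which is one simple reading of "suppresses the mismatched dimensions".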


Pages: 5
