Enhancing the Recognition of Children's Speech on Acoustically Mismatched ASR System

Cited by: 0

Authors:

Shahnawazuddin, S. [1]

Kathania, Hemant Kumar [2]

Sinha, Rohit [1]

Affiliations:

[1] Indian Inst Technol Guwahati, Dept Elect & Elect Engn, Gauhati 781039, India

[2] Natl Inst Technol Sikkim, Dept Elect & Commun Engn, Sikkim 737139, India

Source:

TENCON 2015 - 2015 IEEE REGION 10 CONFERENCE | 2015

Keywords:

Children ASR; feature projection; adaptation;

DOI:

Not available

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

This paper explores the problem of recognizing children's speech using acoustic models trained on adults' speech data. Under such conditions, the large acoustic mismatch between training and test data severely degrades recognition performance. In our earlier work, a binary weighting of cepstral features as well as of acoustic model parameters was explored to address this mismatch. In this paper, a soft weighting (SW) is proposed to overcome the information loss incurred by the simple binary weighting scheme. This is achieved through a low-rank projection learned from adults' training data. The transform so derived emphasizes the principal dimensions of acoustic variation in adults' speech. During testing, the transform maps children's test data to the space of the training data and thus suppresses the mismatched dimensions. The proposed scheme is verified experimentally using a recognition system trained on adults' data only as well as another system trained on adults' and children's data pooled together. The effectiveness of acoustic model adaptation is also explored to further enhance system performance. Combining SW with cluster model interpolation leads to a relative improvement of 14% over the baseline.
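The idea of learning a low-rank projection from adults' data and applying it to children's test features can be illustrated with a minimal sketch. The abstract does not say how the transform is estimated, so the PCA-style estimate below is an assumption made for illustration, not the authors' method. The function names, the feature dimension (13), the rank (3), and the synthetic data are all hypothetical.

```python
import numpy as np

def learn_low_rank_projection(train_feats, rank):
    """Estimate a rank-`rank` orthogonal projector from training features.

    Assumption: the principal directions come from plain PCA (via SVD).
    """
    mean = train_feats.mean(axis=0)
    centered = train_feats - mean
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:rank].T              # shape (dim, rank)
    projector = basis @ basis.T      # shape (dim, dim), rank `rank`
    return mean, projector

def project(feats, mean, projector):
    """Map features onto the subspace learned from the training data."""
    return (feats - mean) @ projector + mean

# Synthetic stand-ins for cepstral features (hypothetical data).
rng = np.random.default_rng(0)
scales = np.array([5.0, 4.0, 3.0] + [0.1] * 10)
adult = rng.normal(size=(500, 13)) * scales
# "Children" features: same structure plus extra variation in all dimensions.
child = rng.normal(size=(50, 13)) * scales + rng.normal(size=(50, 13))

mean, projector = learn_low_rank_projection(adult, rank=3)
child_projected = project(child, mean, projector)
```

In this sketch, the components of the test features that lie outside the subspace spanned by the dominant adult directions are removed, which is one simple way to read "suppresses the mismatched dimensions".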

Pages: 5
