Enhancing the Recognition of Children's Speech on Acoustically Mismatched ASR System

Authors
Shahnawazuddin, S. [1]
Kathania, Hemant Kumar [2]
Sinha, Rohit [1]
Affiliations
[1] Indian Inst Technol Guwahati, Dept Elect & Elect Engn, Gauhati 781039, India
[2] Natl Inst Technol Sikkim, Dept Elect & Commun Engn, Sikkim 737139, India
Keywords
Children ASR; feature projection; adaptation
DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Subject classification codes
0808; 0809
Abstract
This paper addresses the recognition of children's speech using acoustic models trained on adults' speech data. Under such conditions, the large acoustic mismatch between training and test data severely degrades recognition performance. Our earlier work explored a binary weighting of the cepstral features, as well as of the acoustic model parameters, to address this problem. In this paper, a soft-weighting (SW) scheme is proposed to overcome the information loss incurred by the simple binary weighting scheme. The soft-weighting is achieved through a low-rank projection learned from the adults' training data. The transform so derived emphasizes the principal dimensions of acoustic variation in adults' speech. During testing, the transform maps the children's test data to the space of the training data and thus suppresses the mismatched dimensions. The proposed scheme is verified experimentally on a recognition system trained on adults' data only, as well as on another system trained on adults' and children's data pooled together. Acoustic model adaptation is also explored to further enhance system performance. Combining SW with cluster model interpolation leads to a relative improvement of 14% over the baseline.
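The low-rank projection described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the projection is learned, so the code below assumes a plain PCA on the adults' feature vectors; that choice, the synthetic data, the feature dimension of 13, the rank of 4, and the function names `learn_low_rank_projection` and `apply_projection` are all hypothetical and are not taken from the paper.

```python
import numpy as np


def learn_low_rank_projection(train_feats, rank):
    """Learn a rank-`rank` projection from training features (assumed PCA).

    Returns the training mean and a (dim, dim) matrix that projects onto the
    span of the top-`rank` principal directions of the training data.
    """
    mean = train_feats.mean(axis=0)
    centered = train_feats - mean
    # Rows of vt are principal directions, sorted by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:rank].T            # (dim, rank), orthonormal columns
    projection = basis @ basis.T   # (dim, dim), symmetric and idempotent
    return mean, projection


def apply_projection(feats, mean, projection):
    """Map test features into the subspace learned from the training data."""
    return (feats - mean) @ projection + mean


# Synthetic stand-ins for real cepstral features (assumption, for illustration).
rng = np.random.default_rng(0)
dim, rank = 13, 4
# "Adult" data: most variance lies in the first `rank` dimensions.
scales = np.concatenate([np.full(rank, 5.0), np.full(dim - rank, 0.1)])
adult_feats = rng.normal(size=(2000, dim)) * scales
# "Child" data: variance spread over all dimensions, i.e. mismatched.
child_feats = rng.normal(size=(200, dim)) * 2.0

mean, projection = learn_low_rank_projection(adult_feats, rank)
projected = apply_projection(child_feats, mean, projection)
```

In this sketch every output dimension is a weighted combination of the input dimensions, rather than being kept or dropped outright, which is one plausible reading of soft-weighting versus binary weighting. Directions that carry little variance in the adults' data are attenuated in the projected children's features.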
Pages: 5