Enhancing the Recognition of Children's Speech on Acoustically Mismatched ASR System

Authors
Shahnawazuddin, S. [1 ]
Kathania, Hemant Kumar [2 ]
Sinha, Rohit [1 ]
Affiliations
[1] Indian Inst Technol Guwahati, Dept Elect & Elect Engn, Gauhati 781039, India
[2] Natl Inst Technol Sikkim, Dept Elect & Commun Engn, Sikkim 737139, India
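Source
TENCON 2015 - 2015 IEEE REGION 10 CONFERENCE | 2015
Keywords
Children ASR; feature projection; adaptation
DOI
Not available
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract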
This paper explores the problem of recognizing children's speech using acoustic models trained on adults' speech data. Under such conditions, the large acoustic mismatch between training and test data causes a severe degradation in recognition performance. In our earlier work, a binary weighting of cepstral features as well as of acoustic model parameters was explored to address this mismatch. In this paper, a soft-weighting (SW) scheme is proposed to overcome the information loss incurred by the simple binary weighting scheme. This is achieved through a low-rank projection learned from adults' training data. The transform so derived emphasizes the principal dimensions of acoustic variation in adults' speech. During testing, the transform maps children's test data to the space of the training data and thus suppresses the mismatched dimensions. The proposed scheme is verified experimentally using a recognition system trained on adults' data only, as well as another system trained on adults' and children's data pooled together. Acoustic model adaptation is also explored to further enhance system performance. Combining SW with cluster model interpolation leads to a relative improvement of 14% over the baseline.
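The low-rank projection idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the projection is learned, so this example assumes a PCA-style projector built from the top principal directions of the adults' training features. The function names, the feature dimension (13), the rank (5), and the synthetic data are all hypothetical stand-ins, not the authors' setup.

```python
import numpy as np

def learn_low_rank_projection(train_feats, rank):
    """Learn a rank-`rank` projector from training features (assumed PCA-style).

    Returns the training mean and a D x D projection matrix onto the
    top `rank` principal directions of the training data.
    """
    mean = train_feats.mean(axis=0)
    centered = train_feats - mean
    cov = centered.T @ centered / (len(train_feats) - 1)
    # eigh returns eigenvalues in ascending order, so reverse the columns.
    _, eigvecs = np.linalg.eigh(cov)
    top = eigvecs[:, ::-1][:, :rank]
    return mean, top @ top.T

def apply_projection(feats, mean, projector):
    """Map features into the training subspace, suppressing other dimensions."""
    return (feats - mean) @ projector + mean

# Hypothetical synthetic data standing in for adults' and children's features.
rng = np.random.default_rng(0)
dim, rank = 13, 5
basis = rng.normal(size=(rank, dim))
adult = rng.normal(size=(2000, rank)) @ basis + 0.01 * rng.normal(size=(2000, dim))
child = rng.normal(size=(500, rank)) @ basis + 1.0 * rng.normal(size=(500, dim))

mean, P = learn_low_rank_projection(adult, rank)
child_proj = apply_projection(child, mean, P)
```

Because the projector is learned from the adults' data alone, components of the children's features that fall outside the adults' principal subspace are removed, which is one plausible reading of "suppresses the mismatched dimensions".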
Pages: 5