Enhancing the Recognition of Children's Speech on Acoustically Mismatched ASR System

Citations: 0
Authors
Shahnawazuddin, S. [1 ]
Kathania, Hemant Kumar [2 ]
Sinha, Rohit [1 ]
Affiliations
[1] Indian Inst Technol Guwahati, Dept Elect & Elect Engn, Gauhati 781039, India
[2] Natl Inst Technol Sikkim, Dept Elect & Commun Engn, Sikkim 737139, India
Source
TENCON 2015 - 2015 IEEE REGION 10 CONFERENCE | 2015
Keywords
Children ASR; feature projection; adaptation;
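DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics, communication technology];
Discipline classification codes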
0808 ; 0809 ;
Abstract
This paper examines the recognition of children's speech using acoustic models trained on adults' speech data. Under these conditions, the large acoustic mismatch between training and test data severely degrades recognition performance. Our earlier work addressed this mismatch through a binary weighting of the cepstral features and of the acoustic model parameters. This paper proposes a soft-weighting (SW) scheme to overcome the information loss incurred by simple binary weighting. The soft-weighting is realized through a low-rank projection learned from the adults' training data. The resulting transform emphasizes the principal dimensions of acoustic variation in adults' speech. During testing, the transform maps the children's test data to the space of the training data and thus suppresses the mismatched dimensions. The proposed scheme is verified experimentally on a recognition system trained on adults' data only and on another system trained on pooled adults' and children's data. Acoustic model adaptation is also explored to further improve system performance. Combining SW with cluster model interpolation yields a relative improvement of 14% over the baseline.
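The low-rank projection idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the projection is estimated, so the code below assumes a plain PCA-style estimate: the projector is built from the top eigenvectors of the covariance of adults' features and then applied to children's test features. The function names, the `rank` parameter, and the use of NumPy are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def learn_low_rank_projection(train_feats, rank):
    """Estimate a rank-`rank` projector from training features.

    train_feats: array of shape (num_frames, feat_dim), e.g. adults' cepstra.
    Assumption: a PCA-style estimate; the paper's estimator may differ.
    """
    mean = train_feats.mean(axis=0)
    centered = train_feats - mean
    cov = centered.T @ centered / (len(train_feats) - 1)
    # eigh returns eigenvalues in ascending order, so reverse the columns
    # to put the directions of largest training-data variance first.
    _, eigvecs = np.linalg.eigh(cov)
    top = eigvecs[:, ::-1][:, :rank]
    # Orthogonal projector onto the span of the leading directions.
    projector = top @ top.T
    return mean, projector


def apply_projection(feats, mean, projector):
    """Map features onto the training subspace, keeping dimensionality."""
    return (feats - mean) @ projector + mean
```

With a projector of this kind, components of the test features lying outside the dominant subspace of the adults' data are removed rather than passed to the recognizer, which is one way to read "suppresses the mismatched dimensions".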
Pages: 5
Related papers
50 in total
  • [21] ISI ASR System for the Low Resource Speech Recognition Challenge for Indian Languages
    Billa, Jayadev
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3207 - 3211
  • [22] Noise robust in-domain children speech enhancement for automatic Punjabi recognition system under mismatched conditions
    Bawa, Puneet
    Kadyan, Virender
    APPLIED ACOUSTICS, 2021, 175
  • [23] The FawAI ASR System for the ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge
    Sun, Yujia
    Ge, Bing
    Chen, Bo
    Fu, Zhen
    He, Jinxin
    Gao, Hongwei
    Wang, Xue
    2022 13TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2022, : 512 - 516
  • [24] AISPEECH-SJTU ASR SYSTEM FOR THE ACCENTED ENGLISH SPEECH RECOGNITION CHALLENGE
    Tan, Tian
    Lu, Yizhou
    Ma, Rao
    Zhu, Sen
    Guo, Jiaqi
    Qian, Yanmin
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6413 - 6417
  • [25] Investigating recognition of children's speech
    Giuliani, D
    Gerosa, M
    2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL II, PROCEEDINGS: SPEECH II; INDUSTRY TECHNOLOGY TRACKS; DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS; NEURAL NETWORKS FOR SIGNAL PROCESSING, 2003, : 137 - 140
  • [26] Robust recognition of children's speech
    Potamianos, A
    Narayanan, S
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2003, 11 (06): : 603 - 616
  • [27] Speaker Recognition for Children's Speech
    Safavi, Saeid
    Najafian, Maryam
    Hanani, Abualsoud
    Russell, Martin
    Jancovic, Peter
    Carey, Michael
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1834 - 1837
  • [28] ASR emotional speech: Clarifying the issues and enhancing performance
    Athanaselis, T
    Bakamidis, S
    Dologlou, I
    Cowie, R
    Douglas-Cowie, E
    Cox, C
    NEURAL NETWORKS, 2005, 18 (04) : 437 - 444
  • [29] Some Experiments on Context Mismatched Speech Recognition
    Dey, Abhishek
    Shahnawazuddin, S.
    Sinha, Rohit
    2018 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS (SPCOM 2018), 2018, : 247 - 251
  • [30] Improving children's mismatched ASR using structured low-rank feature projection
    Shahnawazuddin, S.
    Kathania, Hemant K.
    Dey, Abhishek
    Sinha, Rohit
    SPEECH COMMUNICATION, 2018, 105 : 103 - 113