Enhancing the Recognition of Children's Speech on Acoustically Mismatched ASR System

Cited by: 0
Authors
Shahnawazuddin, S. [1]
Kathania, Hemant Kumar [2]
Sinha, Rohit [1]
Affiliations
[1] Indian Inst Technol Guwahati, Dept Elect & Elect Engn, Gauhati 781039, India
[2] Natl Inst Technol Sikkim, Dept Elect & Commun Engn, Sikkim 737139, India
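Source
TENCON 2015 - 2015 IEEE REGION 10 CONFERENCE | 2015
Keywords
Children ASR; feature projection; adaptation
DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract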
This paper explores the problem of recognizing children's speech using acoustic models trained on adults' speech data. Under such conditions, the large acoustic mismatch between training and test data causes severe degradation in recognition performance. In our earlier work, a binary weighting of cepstral features as well as of acoustic model parameters was explored to address this mismatch. In this paper, a soft-weighting (SW) scheme is proposed to overcome the information loss incurred by the simple binary weighting scheme. This is achieved through a low-rank projection learned using adults' training data. The transform so derived emphasizes the principal dimensions of acoustic variation in adults' speech. During testing, the transform maps children's test data to the space of the training data and thus suppresses the mismatched dimensions. The proposed scheme is verified experimentally using a recognition system trained on adults' data only as well as another system trained on adults' and children's data pooled together. Acoustic model adaptation is also explored to further enhance system performance. Combining SW with cluster model interpolation leads to a relative improvement of 14% over the baseline.
Pages: 5
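The abstract describes a low-rank projection learned from adults' training data and applied to children's test data. The record does not say how the transform is estimated, so the following is only a minimal sketch under the assumption that a PCA-style projection (top eigenvectors of the adult feature covariance, found here by power iteration) is an acceptable stand-in. The function names `learn_projection` and `project`, the toy 3-dimensional frames, and the choice of rank are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: a PCA-style low-rank projection, assumed here as a
# stand-in for the transform mentioned in the abstract (not the authors' method).

def _norm(v):
    return sum(x * x for x in v) ** 0.5

def _normalize(v):
    n = _norm(v)
    return [x / n for x in v]

def learn_projection(frames, rank, iters=500):
    """Estimate the mean and the top-`rank` principal directions of `frames`."""
    n, d = len(frames), len(frames[0])
    mu = [sum(col) / n for col in zip(*frames)]
    centered = [[x - m for x, m in zip(f, mu)] for f in frames]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    basis = []
    for k in range(rank):
        # Deterministic start vector; power iteration finds the dominant direction.
        v = _normalize([1.0 / (i + k + 1) for i in range(d)])
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            if _norm(w) < 1e-12:
                break
            v = _normalize(w)
        lam = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        # Deflate so the next pass converges to the next principal direction.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
        basis.append(v)
    return mu, basis

def project(frame, mu, basis):
    """Map a frame onto the low-rank subspace spanned by `basis`, around `mu`."""
    c = [x - m for x, m in zip(frame, mu)]
    out = list(mu)
    for b in basis:
        coeff = sum(bi * ci for bi, ci in zip(b, c))
        out = [o + coeff * bi for o, bi in zip(out, b)]
    return out

# Toy "adult" frames: most variation lies in the first two dimensions.
adult_frames = [
    [2.0, 0.0, 0.0], [-2.0, 0.0, 0.0],
    [0.0, 1.0, 0.0], [0.0, -1.0, 0.0],
    [0.0, 0.0, 0.1], [0.0, 0.0, -0.1],
]
mu, basis = learn_projection(adult_frames, rank=2)

# Toy "child" frame with a large component along the low-variance dimension.
child_frame = [1.0, 0.5, 3.0]
mapped = project(child_frame, mu, basis)
```

In this toy setup the component of `child_frame` along the dimension that barely varies in the adult data is pulled back toward the adult mean, while the components along the two dominant adult directions are retained. This mirrors, in a simplified way, the idea of suppressing mismatched dimensions.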