Robust technologies towards automatic speech recognition in car noise environments

Authors
Ding, Pei [1]
He, Lei [1]
Yan, Xiang [1]
Zhao, Rui [1]
Hao, Jie [1]
Affiliation
[1] Toshiba Res & Dev Ctr, Beijing, Peoples R China
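Keywords
robust speech recognition; in-car noise; speech enhancement; spectrum smoothing; immunity learning
DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification codes
0808; 0809
Abstract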
This paper presents research on robust automatic speech recognition (ASR) in car noise environments. In the front-end design, speech enhancement technologies are used to suppress the background noise in the frequency domain, and spectrum smoothing is then applied along both the time and frequency indices to compensate for spectral components distorted by noise over-reduction. In acoustic model training, we propose an immunity learning scheme, in which pre-recorded car noises are artificially added to clean training utterances at different signal-to-noise ratios (SNR) to imitate in-car environments. After analyzing the SNR and noise spectrum of real in-car utterances, we further refine the immunity training set by adjusting the distribution of SNR and increasing the proportion of training noises that have similar characteristics. Evaluation results on isolated phrase recognition show that the ASR system with the proposed technologies achieves an average error rate reduction (ERR) of 90.68% for artificial car-noise speech and 79.08% for real in-car speech, compared with a baseline system that uses no robust technology.
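The noise-addition step described in the abstract (adding pre-recorded noise to clean utterances at chosen SNRs) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `mix_at_snr` and `measured_snr_db`, the looping of short noise clips, and the use of whole-utterance average power to define SNR are all assumptions made for illustration. The synthetic sine wave and Gaussian noise stand in for real speech and car-noise recordings.

```python
import math
import random


def mean_power(signal):
    """Average power (mean of squared samples) of a sequence of floats."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so the mixture has the requested SNR in dB.

    Assumption: SNR is defined over the whole utterance using average power.
    Assumption: a noise clip shorter than the utterance is looped, then trimmed.
    """
    if not clean or not noise:
        raise ValueError("clean and noise must be non-empty")
    repeats = -(-len(clean) // len(noise))  # ceiling division
    noise = (list(noise) * repeats)[:len(clean)]
    p_clean = mean_power(clean)
    p_noise = mean_power(noise)
    if p_clean == 0 or p_noise == 0:
        raise ValueError("clean and noise must have non-zero power")
    # Choose gain so that 10*log10(p_clean / (gain**2 * p_noise)) == snr_db.
    gain = math.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB of a mixture, given the clean signal it was built from."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10 * math.log10(mean_power(clean) / mean_power(residual))


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-ins for real data: a 440 Hz tone and white Gaussian noise.
    clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]
    for target in (0, 5, 10, 15, 20):
        noisy = mix_at_snr(clean, noise, target)
        print(target, round(measured_snr_db(clean, noisy), 3))
```

Refining the training set as the abstract describes would then amount to choosing how often each target SNR and each noise recording is drawn; the abstract does not give those distributions, so none are assumed here.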
Pages: 776 / +
Page count: 2