Local peak enhancement for in-car speech recognition in noisy environment

Cited by: 1
Authors
Ichikawa, Osamu [1 ]
Fukuda, Takashi [1 ]
Nishimura, Masafumi [1 ]
Affiliations
[1] IBM Japan Ltd, Tokyo Res Lab, Yamato Shi 2428502, Japan
Keywords
harmonics; formant; speech enhancement; noise reduction; speech recognition
DOI
10.1093/ietisy/e91-d.3.635
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The accuracy of automatic speech recognition in a car is significantly degraded in very low signal-to-noise ratio (SNR) situations such as "Fan high" or "Window open". In such cases, speech signals are often buried in broadband noise. Although several existing noise reduction algorithms are known to improve the accuracy, other approaches that can work with them are still required for further improvement. One of the candidates is enhancement of the harmonic structures in human voices. However, most conventional approaches are based on comb filtering, and it is difficult to use them in practical situations, because their assumptions for F0 detection and for voiced/unvoiced detection are not accurate enough in realistic noisy environments. In this paper, we propose a new approach that does not rely on such detection. An observed power spectrum is directly converted into a filter for speech enhancement, by retaining only the local peaks considered to be harmonic structures in the human voice. In our experiments, this approach reduced the word error rate by 17% in realistic automobile environments. Also, it showed further improvement when used with existing noise reduction methods.
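The abstract does not say how local peaks are selected or how the filter is shaped, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a simple rule (a bin is a peak if its power exceeds both neighbors) and a fixed attenuation `floor` for all other bins. The function names, the Hann window, and the parameter values are illustrative assumptions.

```python
import numpy as np


def local_peak_filter(power, floor=0.1):
    """Turn a power spectrum into per-bin gains.

    Assumption (not from the paper): a bin is a local peak when it is
    strictly greater than both neighbors. Peaks get gain 1.0, all other
    bins get the hypothetical attenuation `floor`.
    """
    power = np.asarray(power, dtype=float)
    gain = np.full(power.shape, floor)
    # Compare interior bins with their left and right neighbors.
    is_peak = (power[1:-1] > power[:-2]) & (power[1:-1] > power[2:])
    gain[1:-1][is_peak] = 1.0
    return gain


def enhance_frame(frame, floor=0.1):
    """Apply the peak-retaining gains to one windowed frame.

    No F0 estimate and no voiced/unvoiced decision is used; the filter
    is derived from the observed spectrum alone.
    """
    frame = np.asarray(frame, dtype=float)
    spectrum = np.fft.rfft(frame * np.hanning(len(frame)))
    gain = local_peak_filter(np.abs(spectrum) ** 2, floor)
    return np.fft.irfft(spectrum * gain, n=len(frame))


if __name__ == "__main__":
    rate, n = 8000, 400  # hypothetical sampling rate (Hz) and frame length
    t = np.arange(n) / rate
    rng = np.random.default_rng(0)
    # Synthetic "voiced" frame: three harmonics of 200 Hz plus broadband noise.
    voiced = sum(np.sin(2 * np.pi * 200 * k * t) for k in (1, 2, 3))
    noisy = voiced + 0.5 * rng.standard_normal(n)
    enhanced = enhance_frame(noisy)
    print(enhanced.shape)
```

A rule this simple also keeps spurious noise peaks, so a practical system would likely need some smoothing of the spectrum or constraints on which peaks count as harmonic; how the paper handles that is not stated in the abstract.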
Pages: 635-639
Page count: 5