Local peak enhancement for in-car speech recognition in noisy environment

Cited by: 1
Authors
Ichikawa, Osamu [1]
Fukuda, Takashi [1]
Nishimura, Masafumi [1]
Affiliations
[1] IBM Japan Ltd, Tokyo Res Lab, Yamato Shi 2428502, Japan
Keywords
harmonics; formant; speech enhancement; noise reduction; speech recognition
DOI
10.1093/ietisy/e91-d.3.635
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The accuracy of automatic speech recognition in a car is significantly degraded in a very low SNR (Signal to Noise Ratio) situation such as "Fan high" or "Window open". In such cases, speech signals are often buried in broadband noise. Although several existing noise reduction algorithms are known to improve the accuracy, other approaches that can work with them are still required for further improvement. One of the candidates is enhancement of the harmonic structures in human voices. However, most conventional approaches are based on comb filtering, and it is difficult to use them in practical situations, because their assumptions for F0 detection and for voiced/unvoiced detection are not accurate enough in realistic noisy environments. In this paper, we propose a new approach that does not rely on such detection. An observed power spectrum is directly converted into a filter for speech enhancement, by retaining only the local peaks considered to be harmonic structures in the human voice. In our experiments, this approach reduced the word error rate by 17% in realistic automobile environments. Also, it showed further improvement when used with existing noise reduction methods.
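The central idea in the abstract, turning an observed power spectrum into an enhancement filter by keeping only its local peaks, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how peaks are selected or how the filter is shaped, so the strict three-point peak test, the function names `local_peak_gain` and `apply_gain`, and the attenuation value `floor` are all assumptions made for illustration.

```python
import numpy as np


def local_peak_gain(power_spectrum, floor=0.1):
    """Return a per-bin gain: 1.0 at local peaks, `floor` elsewhere.

    Assumption: a bin is a local peak if it is strictly larger than both
    of its immediate neighbours. The edge bins are never treated as peaks.
    """
    p = np.asarray(power_spectrum, dtype=float)
    gain = np.full(p.shape, float(floor))
    if p.size < 3:
        return gain
    # Compare every interior bin with its left and right neighbour.
    is_peak = (p[1:-1] > p[:-2]) & (p[1:-1] > p[2:])
    # Shift by one because the comparison skipped the first bin.
    peak_bins = np.flatnonzero(is_peak) + 1
    gain[peak_bins] = 1.0
    return gain


def apply_gain(power_spectrum, floor=0.1):
    """Attenuate all non-peak bins of one frame's power spectrum."""
    p = np.asarray(power_spectrum, dtype=float)
    return p * local_peak_gain(p, floor=floor)


if __name__ == "__main__":
    # Toy power spectrum for one frame (hypothetical values).
    frame_power = np.array([1.0, 5.0, 2.0, 1.0, 7.0, 3.0, 3.0, 9.0, 1.0])
    print(local_peak_gain(frame_power))
    print(apply_gain(frame_power))
```

Note that the gain is derived from the frame's own spectrum, so the sketch needs no F0 estimate and no voiced/unvoiced decision, which is the property the abstract emphasizes. A practical system would presumably restrict or smooth the peaks so that noise-only peaks are not retained; the abstract gives no detail on that step.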
Pages: 635-639
Page count: 5