A spatio-temporal speech enhancement scheme for robust speech recognition in noisy environments

Cited by: 27
Authors
Visser, E
Otsuka, M
Lee, TW
Affiliations
[1] Univ Calif San Diego, Inst Neural Computat, Dept 0523, La Jolla, CA 92093 USA
[2] DENSO Corp, Res Labs, Aichi 4700111, Japan
Keywords
speech enhancement; robust speech recognition; blind source separation; noisy environments
DOI
10.1016/S0167-6393(03)00010-4
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
A new speech enhancement scheme is presented integrating spatial and temporal signal processing methods for robust speech recognition in noisy environments. The scheme first separates spatially localized point sources from noisy speech signals recorded by two microphones. Blind source separation algorithms assuming no a priori knowledge about the sources involved are applied in this spatial processing stage. Then denoising of distributed background noise is achieved in a combined spatial/temporal processing approach. The desired speaker signal is first processed along with an artificially constructed noise signal in a supplementary blind source separation step. It is further denoised by exploiting differences in temporal speech and noise statistics in a wavelet filterbank. The scheme's performance is illustrated by speech recognition experiments on real recordings in a noisy car environment. In comparison to a common multi-microphone technique like beamforming with spectral subtraction, the scheme is shown to enable more accurate speech recognition in the presence of a highly interfering point source and strong background noise. (C) 2003 Elsevier B.V. All rights reserved.
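The temporal denoising step mentioned in the abstract can be illustrated in general terms. The following is only a minimal sketch, assuming a one-level Haar transform with soft thresholding of the detail coefficients; the record does not specify the paper's actual wavelet filterbank or its statistical criterion, and the names `haar_forward`, `haar_inverse`, `soft_threshold`, and `denoise` are hypothetical.

```python
import math

def haar_forward(x):
    """One-level orthonormal Haar transform of an even-length sequence.

    Returns (approximation, detail) coefficient lists.
    """
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; small ones become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def denoise(x, threshold):
    """Illustrative denoiser: threshold only the Haar detail band.

    Assumption: noise dominates the small detail coefficients. This is a
    generic technique, not the method described in the paper.
    """
    approx, detail = haar_forward(x)
    return haar_inverse(approx, soft_threshold(detail, threshold))
```

With a threshold of zero the signal is reconstructed unchanged; with a threshold larger than every detail coefficient, each pair of samples is replaced by its mean. A practical system would use a deeper filterbank and a data-driven threshold.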
Pages: 393-407
Page count: 15
Related papers
50 in total
  • [1] Speech enhancement applied to speech recognition in noisy environments
    [J]. Xu, Y.F, 2001, Press of Tsinghua University (41):
  • [2] Robust recognition of noisy speech using speech enhancement
    Xu, YF
    Zhang, JJ
    Yao, KS
    Cao, ZG
    Ma, ZX
    [J]. 2000 5TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I-III, 2000, : 734 - 737
  • [3] Spatio-temporal biologically inspired models for clean and noisy speech recognition
    Ben Salem, Zouhour Neji
    Bougrain, Laurent
    Alexandre, Frederic
    [J]. NEUROCOMPUTING, 2007, 71 (1-3) : 131 - 136
  • [4] A robust speech enhancement method in noisy environments
    Abajaddi, Nesrine
    Mounir, Badia
    Elfahm, Youssef
    Farchi, Abdelmajid
    [J]. INTERNATIONAL JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING SYSTEMS, 2023, 14 (09) : 973 - 983
  • [5] Speech enhancement strategy for speech recognition microcontroller under noisy environments
    Chan, Kit Yan
    Nordholm, Sven
    Yiu, Ka Fai Cedric
    Togneri, Roberto
    [J]. NEUROCOMPUTING, 2013, 118 : 279 - 288
  • [6] Spatio-temporal processing for distant speech recognition
    Low, SY
    Togneri, R
    Nordholm, S
    [J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 1001 - 1004
  • [7] A robust endpoint detection of speech for noisy environments with application to automatic speech recognition
    Bou-Ghazale, SE
    Assaleh, K
    [J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 3808 - 3811
  • [8] Noisy speech recognition based on speech enhancement
    Wang, Xia
    Tang, Hongmei
    Zhao, Xiaoqun
    [J]. SNPD 2007: EIGHTH ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING, AND PARALLEL/DISTRIBUTED COMPUTING, VOL 3, PROCEEDINGS, 2007, : 713 - +
  • [9] Spatio-temporal organization map: A speech recognition application
    Ben Salem, ZN
    Mouria-Beji, F
    Kamoun, F
    [J]. ARTIFICIAL NEURAL NETWORKS: BIOLOGICAL INSPIRATIONS - ICANN 2005, PT 1, PROCEEDINGS, 2005, 3696 : 371 - 378
  • [10] A robust feature extraction for automatic speech recognition in noisy environments
    Lima, C
    Almeida, LB
    Monteiro, JL
    [J]. 2002 6TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I AND II, 2002, : 540 - 543