A spatio-temporal speech enhancement scheme for robust speech recognition in noisy environments

Cited by: 27
Authors
Visser, E
Otsuka, M
Lee, TW
Affiliations
[1] Univ Calif San Diego, Inst Neural Computat, Dept 0523, La Jolla, CA 92093 USA
[2] DENSO Corp, Res Labs, Aichi 4700111, Japan
Keywords
speech enhancement; robust speech recognition; blind source separation; noisy environments
DOI
10.1016/S0167-6393(03)00010-4
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
A new speech enhancement scheme is presented integrating spatial and temporal signal processing methods for robust speech recognition in noisy environments. The scheme first separates spatially localized point sources from noisy speech signals recorded by two microphones. Blind source separation algorithms assuming no a priori knowledge about the sources involved are applied in this spatial processing stage. Then denoising of distributed background noise is achieved in a combined spatial/temporal processing approach. The desired speaker signal is first processed along with an artificially constructed noise signal in a supplementary blind source separation step. It is further denoised by exploiting differences in temporal speech and noise statistics in a wavelet filterbank. The scheme's performance is illustrated by speech recognition experiments on real recordings in a noisy car environment. In comparison to a common multi-microphone technique like beamforming with spectral subtraction, the scheme is shown to enable more accurate speech recognition in the presence of a highly interfering point source and strong background noise. © 2003 Elsevier B.V. All rights reserved.
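To make the spatial processing stage more concrete, the sketch below shows one generic way to blindly separate two sources from two sensor signals. It is not the authors' method: the abstract does not name the separation algorithm, and real car recordings involve convolutive (reverberant) mixing, whereas this toy example assumes an instantaneous 2x2 mixture. It whitens the two channels and then searches for the rotation that maximizes the summed absolute excess kurtosis, a common non-Gaussianity criterion in independent component analysis. All function names, the mixing coefficients, and the synthetic sources (a Laplacian "speech-like" source and a uniform interferer) are assumptions made for illustration.

```python
import math
import random


def center(x):
    """Subtract the sample mean from a channel."""
    m = sum(x) / len(x)
    return [v - m for v in x]


def correlation(a, b):
    """Pearson correlation between two equally long sequences."""
    a, b = center(a), center(b)
    num = sum(u * v for u, v in zip(a, b))
    den = math.sqrt(sum(u * u for u in a) * sum(v * v for v in b))
    return num / den


def excess_kurtosis(y):
    """Excess kurtosis of a zero-mean signal (0 for a Gaussian)."""
    n = len(y)
    m2 = sum(v * v for v in y) / n
    m4 = sum(v ** 4 for v in y) / n
    return m4 / (m2 * m2) - 3.0


def whiten(x1, x2):
    """Decorrelate two zero-mean channels and scale them to unit variance."""
    n = len(x1)
    a = sum(u * u for u in x1) / n               # var(x1)
    b = sum(u * v for u, v in zip(x1, x2)) / n   # cov(x1, x2)
    c = sum(v * v for v in x2) / n               # var(x2)
    # Principal-axis angle of the symmetric 2x2 covariance [[a, b], [b, c]].
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    ct, st = math.cos(theta), math.sin(theta)
    l1 = a * ct * ct + 2.0 * b * ct * st + c * st * st   # variance along axis 1
    l2 = a * st * st - 2.0 * b * ct * st + c * ct * ct   # variance along axis 2
    z1 = [(ct * u + st * v) / math.sqrt(l1) for u, v in zip(x1, x2)]
    z2 = [(-st * u + ct * v) / math.sqrt(l2) for u, v in zip(x1, x2)]
    return z1, z2


def rotate(z1, z2, phi):
    """Apply a plane rotation by angle phi to a pair of channels."""
    c, s = math.cos(phi), math.sin(phi)
    y1 = [c * u + s * v for u, v in zip(z1, z2)]
    y2 = [-s * u + c * v for u, v in zip(z1, z2)]
    return y1, y2


def separate_two_channels(x1, x2, steps=90):
    """Whiten, then grid-search the rotation with maximal non-Gaussianity.

    Angles in [0, pi/2) suffice because the solution is only defined up to
    permutation and sign of the outputs.
    """
    z1, z2 = whiten(center(x1), center(x2))
    best_phi, best_score = 0.0, -1.0
    for k in range(steps):
        phi = k * (math.pi / 2.0) / steps
        y1, y2 = rotate(z1, z2, phi)
        score = abs(excess_kurtosis(y1)) + abs(excess_kurtosis(y2))
        if score > best_score:
            best_phi, best_score = phi, score
    return rotate(z1, z2, best_phi)


def make_mixture(n, seed=0):
    """Synthetic test data: two independent sources and two mixed channels."""
    rng = random.Random(seed)
    # Laplacian source (super-Gaussian, loosely speech-like amplitude statistics).
    s1 = [rng.expovariate(1.0) * rng.choice((-1.0, 1.0)) for _ in range(n)]
    # Uniform source (sub-Gaussian interferer).
    s2 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    # Arbitrary, assumed instantaneous mixing matrix [[1.0, 0.6], [0.5, 1.0]].
    x1 = [1.0 * u + 0.6 * v for u, v in zip(s1, s2)]
    x2 = [0.5 * u + 1.0 * v for u, v in zip(s1, s2)]
    return s1, s2, x1, x2


if __name__ == "__main__":
    s1, s2, x1, x2 = make_mixture(2000)
    y1, y2 = separate_two_channels(x1, x2)
    for name, s in (("s1", s1), ("s2", s2)):
        print(name, round(correlation(s, y1), 3), round(correlation(s, y2), 3))
```

The outputs are recovered only up to ordering, sign, and scale, which is inherent to blind source separation. A practical system for reverberant recordings would need a convolutive formulation (for example, separation filters or per-frequency-bin unmixing) and a more efficient optimizer than a grid search.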
Pages: 393-407
Number of pages: 15