A spatio-temporal speech enhancement scheme for robust speech recognition in noisy environments

Cited by: 27

Authors:

Visser, E

Otsuka, M

Lee, TW

Affiliations:

[1] Univ Calif San Diego, Inst Neural Computat, Dept 0523, La Jolla, CA 92093 USA

[2] DENSO Corp, Res Labs, Aichi 4700111, Japan

Source:

SPEECH COMMUNICATION | 2003 / Vol. 41 / Issue 2-3

Keywords:

speech enhancement; robust speech recognition; blind source separation; noisy environments;

DOI:

10.1016/S0167-6393(03)00010-4

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

A new speech enhancement scheme is presented integrating spatial and temporal signal processing methods for robust speech recognition in noisy environments. The scheme first separates spatially localized point sources from noisy speech signals recorded by two microphones. Blind source separation algorithms assuming no a priori knowledge about the sources involved are applied in this spatial processing stage. Then denoising of distributed background noise is achieved in a combined spatial/temporal processing approach. The desired speaker signal is first processed along with an artificially constructed noise signal in a supplementary blind source separation step. It is further denoised by exploiting differences in temporal speech and noise statistics in a wavelet filterbank. The scheme's performance is illustrated by speech recognition experiments on real recordings in a noisy car environment. In comparison to a common multi-microphone technique like beamforming with spectral subtraction, the scheme is shown to enable more accurate speech recognition in the presence of a highly interfering point source and strong background noise. (C) 2003 Elsevier B.V. All rights reserved.

Pages: 393-407

Page count: 15

50 items in total

[1] Speech enhancement applied to speech recognition in noisy environments
Xu, Y.F., 2001, Press of Tsinghua University (41):
[2] Robust recognition of noisy speech using speech enhancement
Xu, YF
Zhang, JJ
Yao, KS
Cao, ZG
Ma, ZX
2000 5TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I-III, 2000, : 734 - 737
[3] Spatio-temporal biologically inspired models for clean and noisy speech recognition
Ben Salem, Zouhour Neji
Bougrain, Laurent
Alexandre, Frederic
NEUROCOMPUTING, 2007, 71 (1-3) : 131 - 136
[4] A robust speech enhancement method in noisy environments
Abajaddi, Nesrine
Mounir, Badia
Elfahm, Youssef
Farchi, Abdelmajid
INTERNATIONAL JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING SYSTEMS, 2023, 14 (09) : 973 - 983
[5] Speech enhancement strategy for speech recognition microcontroller under noisy environments
Chan, Kit Yan
Nordholm, Sven
Yiu, Ka Fai Cedric
Togneri, Roberto
NEUROCOMPUTING, 2013, 118 : 279 - 288
[6] Spatio-temporal processing for distant speech recognition
Low, SY
Togneri, R
Nordholm, S
2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 1001 - 1004
[7] A robust endpoint detection of speech for noisy environments with application to automatic speech recognition
Bou-Ghazale, SE
Assaleh, K
2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 3808 - 3811
[8] Spatio-temporal organization map: A speech recognition application
Ben Salem, ZN
Mouria-Beji, F
Kamoun, F
ARTIFICIAL NEURAL NETWORKS: BIOLOGICAL INSPIRATIONS - ICANN 2005, PT 1, PROCEEDINGS, 2005, 3696 : 371 - 378
[9] Noisy speech recognition based on speech enhancement
Wang, Xia
Tang, Hongmei
Zhao, Xiaoqun
SNPD 2007: EIGHTH ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING, AND PARALLEL/DISTRIBUTED COMPUTING, VOL 3, PROCEEDINGS, 2007, : 713 - +
[10] Robust Keyword Spotting for Noisy Environments by Leveraging Speech Enhancement and Speech Presence Probability
Yang, Chouchang
Saidutta, Yashas Malur
Srinivasa, Rakshith Sharma
Lee, Ching-Hua
Shen, Yilin
Jin, Hongxia
INTERSPEECH 2023, 2023, : 1638 - 1642