An effective cluster-based model for robust speech detection and speech recognition in noisy environments

Cited by: 21

Authors:

Gorriz, J. M. ^{[1]}

Ramirez, J.

Segura, J. C.

Puntonet, C. G.

Affiliations:

[1] Univ Granada, Dept Signal Theory, Granada, Spain

[2] Univ Granada, Dept Comp Architecture & Technol, Granada, Spain

Source:

JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA | 2006 / Vol. 120 / No. 01

Keywords:

DOI:

10.1121/1.2208450

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

This paper presents an accurate speech detection algorithm for improving the performance of speech recognition systems working in noisy environments. The proposed method is based on a hard-decision clustering approach in which a set of prototypes is used to characterize the noisy channel. Detecting the presence of speech is enabled by a decision rule formulated in terms of an averaged distance between the observation vector and a cluster-based noise model. The algorithm benefits from using contextual information, a strategy that considers not only a single speech frame but also a neighborhood of data in order to smooth the decision function and improve speech detection robustness. The proposed scheme exhibits reduced computational cost, making it suitable for real-time applications, e.g., automated speech recognition systems. An exhaustive analysis is conducted on the AURORA 2 and AURORA 3 databases in order to assess the performance of the algorithm and to compare it to existing standard voice activity detection (VAD) methods. The results show significant improvements in detection accuracy and speech recognition rate over standard VADs such as ITU-T G.729, ETSI GSM AMR, and ETSI AFE for distributed speech recognition, and over a representative set of recently reported VAD algorithms. © 2006 Acoustical Society of America.
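
The decision rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features, the clustering procedure, the distance measure, or the threshold, so the sketch assumes plain k-means for the hard-decision clustering, Euclidean distance, a fixed threshold, and a symmetric window of neighboring frames for the contextual smoothing. The names `kmeans`, `averaged_distance`, `detect_speech`, `threshold`, and `context` are hypothetical, and the data are synthetic.

```python
import math
import random


def kmeans(frames, k, iters=20, seed=0):
    """Hard-decision clustering (plain k-means, an assumption of this sketch).

    Returns k prototype vectors summarizing the noise-only frames.
    """
    rng = random.Random(seed)
    prototypes = [list(f) for f in rng.sample(frames, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for f in frames:
            # Hard assignment: each frame goes to its single nearest prototype.
            j = min(range(k), key=lambda i: math.dist(f, prototypes[i]))
            groups[j].append(f)
        for i, g in enumerate(groups):
            if g:  # keep the previous prototype if a cluster is empty
                prototypes[i] = [sum(col) / len(g) for col in zip(*g)]
    return prototypes


def averaged_distance(frame, prototypes):
    """Mean Euclidean distance from one observation to all noise prototypes."""
    return sum(math.dist(frame, p) for p in prototypes) / len(prototypes)


def detect_speech(frames, prototypes, threshold, context=2):
    """Flag a frame as speech when the distance score, averaged over a
    neighborhood of `context` frames on each side, exceeds `threshold`."""
    scores = [averaged_distance(f, prototypes) for f in frames]
    decisions = []
    for t in range(len(scores)):
        lo, hi = max(0, t - context), min(len(scores), t + context + 1)
        smoothed = sum(scores[lo:hi]) / (hi - lo)
        decisions.append(smoothed > threshold)
    return decisions


# Synthetic 2-D "features": noise near the origin, speech far from it.
rng = random.Random(1)
noise = [[rng.gauss(0.0, 0.3), rng.gauss(0.0, 0.3)] for _ in range(200)]
prototypes = kmeans(noise, k=4)

signal = [[0.0, 0.0]] * 10 + [[5.0, 5.0]] * 10 + [[0.0, 0.0]] * 10
decisions = detect_speech(signal, prototypes, threshold=2.5, context=2)
```

Averaging the score over neighboring frames trades some precision at speech onsets and offsets for a steadier decision on isolated outlier frames. The window length and threshold used here are illustrative values only.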


Pages: 470 - 481

Page count: 12

50 records in total

[1] An effective cluster-based model for robust speech detection and speech recognition in noisy environments
Górriz, J.M.
Ramírez, J.
Segura, J.C.
Puntonet, C.G.
Journal of the Acoustical Society of America, 2006, 120 (01) : 470 - 481
[2] A robust endpoint detection of speech for noisy environments with application to automatic speech recognition
Bou-Ghazale, SE
Assaleh, K
2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002 : 3808 - 3811
[3] Robust Speech Detection for Noisy Environments
Varela, Oscar
Indra, S. A.
San-Segundo, Ruben
Hernandez, Luis A.
IEEE AEROSPACE AND ELECTRONIC SYSTEMS MAGAZINE, 2011, 26 (11) : 16 - U12
[4] Linearized distortion model for robust speech recognition in noisy environments
He, Yong-Jun
Han, Ji-Qing
Tongxin Xuebao/Journal on Communications, 2010, 31 (09) : 8 - 14
[5] Auditory model for robust speech recognition in real world noisy environments
Kim, DS
Lee, SY
Kil, RM
Zhu, XL
ELECTRONICS LETTERS, 1997, 33 (01) : 12 - 13
[6] Robust speech/non-speech detection based on LDA-derived parameter and voicing parameter for speech recognition in noisy environments
Martin, A
Mauuary, L
SPEECH COMMUNICATION, 2006, 48 (02) : 191 - 206
[7] Speech enhancement method based on feature compensation gain for effective speech recognition in noisy environments
Bae, Ara
Kim, Wooil
JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2019, 38 (01) : 51 - 55
[8] Robust emotional speech recognition based on binaural model and emotional auditory mask in noisy environments
Bashirpour, Meysam
Geravanchizadeh, Masoud
EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2018
[9] Robust emotional speech recognition based on binaural model and emotional auditory mask in noisy environments
Meysam Bashirpour
Masoud Geravanchizadeh
EURASIP Journal on Audio, Speech, and Music Processing, 2018
[10] Noisy speech recognition based on robust end-point detection and model adaptation
Zhang, ZP
Furui, S
2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005 : 441 - 444
