An effective cluster-based model for robust speech detection and speech recognition in noisy environments

Cited by: 21

Authors:

Gorriz, J. M. [1]

Ramirez, J.

Segura, J. C.

Puntonet, C. G.

Affiliations:

[1] Univ Granada, Dept Signal Theory, Granada, Spain

[2] Univ Granada, Dept Comp Architecture & Technol, Granada, Spain

Source:

JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA | 2006 / Vol. 120 / No. 1

Keywords:

DOI:

10.1121/1.2208450

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

This paper presents an accurate speech detection algorithm for improving the performance of speech recognition systems working in noisy environments. The proposed method is based on a hard-decision clustering approach in which a set of prototypes is used to characterize the noisy channel. The presence of speech is detected by a decision rule formulated in terms of an averaged distance between the observation vector and a cluster-based noise model. The algorithm benefits from contextual information: it considers not only a single speech frame but also a neighborhood of data in order to smooth the decision function and improve speech detection robustness. The proposed scheme has a reduced computational cost, making it adequate for real-time applications such as automated speech recognition systems. An exhaustive analysis is conducted on the AURORA 2 and AURORA 3 databases in order to assess the performance of the algorithm and to compare it to existing standard voice activity detection (VAD) methods. The results show significant improvements in detection accuracy and speech recognition rate over standard VADs such as ITU-T G.729, ETSI GSM AMR, and ETSI AFE for distributed speech recognition, as well as over a representative set of recently reported VAD algorithms. © 2006 Acoustical Society of America.
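
The idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no feature definition, distance metric, or threshold, so the 2-D toy feature vectors, the plain k-means clustering, the Euclidean distance, and the names `kmeans`, `noise_distance`, `detect_speech`, `threshold`, and `context` are all assumptions made for illustration only.

```python
import math
import random

def kmeans(frames, k, iters=20, seed=0):
    """Hard-decision clustering: every frame belongs to exactly one prototype."""
    rng = random.Random(seed)
    centers = [list(f) for f in rng.sample(frames, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for f in frames:
            nearest = min(range(k), key=lambda j: math.dist(f, centers[j]))
            groups[nearest].append(f)
        for j, g in enumerate(groups):
            if g:  # keep the old prototype if its cluster went empty
                centers[j] = [sum(col) / len(g) for col in zip(*g)]
    return centers

def noise_distance(frame, prototypes):
    """Distance from one frame to the noise model, averaged over the prototypes."""
    return sum(math.dist(frame, p) for p in prototypes) / len(prototypes)

def detect_speech(frames, prototypes, threshold, context=2):
    """Flag frame t as speech when the distance, smoothed over the
    neighborhood t-context .. t+context, exceeds the threshold."""
    d = [noise_distance(f, prototypes) for f in frames]
    flags = []
    for t in range(len(d)):
        window = d[max(0, t - context): t + context + 1]
        flags.append(sum(window) / len(window) > threshold)
    return flags

# Toy data (assumed): noise frames near the origin, "speech" frames near (3, 3).
rng = random.Random(1)
noise = [[rng.gauss(0, 0.1), rng.gauss(0, 0.1)] for _ in range(50)]
speech = [[rng.gauss(3, 0.1), rng.gauss(3, 0.1)] for _ in range(10)]

prototypes = kmeans(noise, k=4)           # noise model built from noise-only frames
signal = noise[:10] + speech + noise[10:20]
flags = detect_speech(signal, prototypes, threshold=2.0)
```

In this sketch the smoothing window trades boundary precision for stability: frames next to a speech onset or offset share part of their window with the other class, so their decision depends on the chosen `threshold` and `context`.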

Pages: 470-481

Page count: 12
