Noise-robust speech feature processing with empirical mode decomposition

Cited by: 3

Authors:

Wu, Kuo-Hau [1]

Chen, Chia-Ping [1]

Yeh, Bing-Feng [1]

Affiliation:

[1] Natl Sun Yat Sen Univ, Dept Comp Sci & Engn, Kaohsiung 800, Taiwan

Source:

EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING | 2011

Keywords:

Speech Signal; Empirical Mode Decomposition; Automatic Speech Recognition; Intrinsic Mode Function; Lower Envelope;

DOI:

10.1186/1687-4722-2011-9

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

In this article, a novel technique based on the empirical mode decomposition methodology for processing speech features is proposed and investigated. Empirical mode decomposition generalizes Fourier analysis: it decomposes a signal into a sum of intrinsic mode functions. In this study, we implement an iterative algorithm to find the intrinsic mode functions of any given signal. We design a novel speech feature post-processing method based on the extracted intrinsic mode functions to achieve noise-robustness for automatic speech recognition. Evaluation results on the noisy-digit Aurora 2.0 database show that our method leads to significant performance improvement. The relative improvement over the baseline features increases from 24.0% to 41.1% when the proposed post-processing method is applied to mean-variance normalized speech features. The proposed method also improves on the performance achieved by a very noise-robust frontend when the test speech data are highly mismatched.
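
As an illustration of the decomposition the abstract refers to, the following is a minimal sketch of a generic sifting procedure that splits a signal into intrinsic mode functions plus a residue. It is not the authors' implementation: the function names (`local_extrema`, `envelope`, `sift`, `emd`), the fixed number of sifting passes, and the stopping rule are all assumptions made for illustration, and piecewise-linear envelopes (`np.interp`) stand in for the cubic-spline envelopes that empirical mode decomposition normally uses.

```python
import numpy as np


def local_extrema(x):
    """Return indices of interior local maxima and minima of a 1-D array."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear envelope through x[idx].

    Simplification (assumption): the envelope is pinned to the first and
    last samples, and linear interpolation replaces the usual cubic spline.
    """
    n = len(x)
    knots = np.concatenate(([0], idx, [n - 1]))
    return np.interp(np.arange(n), knots, x[knots])


def sift(x, n_sift=10):
    """Extract one candidate IMF by repeatedly removing the envelope mean."""
    h = x.copy()
    for _ in range(n_sift):  # fixed pass count: an illustrative choice
        maxima, minima = local_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break
        mean_env = 0.5 * (envelope(h, maxima) + envelope(h, minima))
        h = h - mean_env
    return h


def emd(x, max_imfs=6, n_sift=10):
    """Decompose x so that x == sum(imfs) + residue (up to rounding)."""
    residue = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if len(maxima) < 2 or len(minima) < 2:
            break  # residue has too few oscillations left to sift
        imf = sift(residue, n_sift)
        imfs.append(imf)
        residue = residue - imf
    return imfs, residue


# Usage: a two-tone test signal (hypothetical data, not from the paper).
t = np.linspace(0.0, 1.0, 500)
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)
imfs, residue = emd(signal)
reconstruction = np.sum(imfs, axis=0) + residue
```

By construction the extracted components and the residue add back up to the input, which is the "sum of intrinsic mode functions" property stated in the abstract. How the extracted functions are then used to post-process speech features is specific to the article and is not reproduced here.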

