Front-end for Far-field Speech Recognition based on Frequency Domain Linear Prediction

Cited by: 0

Authors:

Ganapathy, Sriram [1,2]

Thomas, Samuel [1,2]

Hermansky, Hynek [1,2]

Affiliations:

[1] IDIAP Res Inst, Martigny, Switzerland

[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland

Source:

INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5 | 2008

Keywords:

Frequency Domain Linear Prediction; Front-end for Far-field Speech; Reverberant Speech; Speech Recognition;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Automatic Speech Recognition (ASR) systems usually fail when they encounter speech from a far-field microphone in reverberant environments. This is due to the application of short-term feature extraction techniques which do not compensate for the artifacts introduced by long room impulse responses. In this paper, we propose a front-end, based on Frequency Domain Linear Prediction (FDLP), that tries to remove reverberation artifacts present in far-field speech. Long temporal segments of far-field speech are analyzed in narrow frequency sub-bands to extract FDLP envelopes and residual signals. Filtering the residual signals with gain-normalized inverse FDLP filters results in a set of sub-band signals, which are synthesized to reconstruct the signal. ASR experiments on far-field speech data processed by the proposed front-end show significant improvements (relative reduction of 30% in word error rate) compared to other robust feature extraction techniques.
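
The abstract gives no implementation details, so the following is only a rough, single-band sketch of the general FDLP idea it names: linear prediction applied to the DCT of a long segment, with the all-pole model then read as an estimate of the squared temporal envelope. The function name `fdlp_envelope`, the model order, and the synthetic test signal are all assumptions made for illustration. The sub-band decomposition, residual filtering, and resynthesis described in the abstract are not reproduced here.

```python
import numpy as np
from scipy.fft import dct
from scipy.linalg import solve_toeplitz
from scipy.signal import freqz


def fdlp_envelope(x, order=10):
    """Hypothetical helper: fit an all-pole model to the DCT of x.

    Returns (a, gain, envelope) where
      a        : coefficients [1, a1, ..., ap] of the prediction filter
      gain     : prediction-error power of the model
      envelope : gain-normalized estimate of the squared temporal envelope
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    # The DCT maps the time segment to a real frequency-domain sequence.
    c = dct(x, type=2, norm="ortho")
    # Autocorrelation of the DCT sequence for lags 0..order.
    r = np.array([np.dot(c[: n - k], c[k:]) for k in range(order + 1)])
    # Autocorrelation-method linear prediction (Toeplitz normal equations).
    coeffs = solve_toeplitz(r[:order], -r[1 : order + 1])
    a = np.concatenate(([1.0], coeffs))
    gain = r[0] + np.dot(coeffs, r[1 : order + 1])
    # Evaluating 1/|A|^2 on [0, pi) yields one envelope value per time sample.
    # The numerator is fixed to 1, so the model gain is discarded.
    _, h = freqz([1.0], a, worN=n)
    return a, gain, np.abs(h) ** 2


# Synthetic example (assumed data): noise with an energy burst near sample 300.
rng = np.random.default_rng(0)
n = 1000
t = np.arange(n)
shape = 0.1 + np.exp(-0.5 * ((t - 300) / 30.0) ** 2)
x = shape * rng.standard_normal(n)

a, gain, env = fdlp_envelope(x)
peak = int(np.argmax(env))  # expected to fall near the burst

# Rescaling the input changes the gain but not the prediction coefficients,
# so the gain-normalized envelope is unaffected by a constant level change.
a_scaled, gain_scaled, env_scaled = fdlp_envelope(4.0 * x)
```

In this sketch, fixing the numerator to 1 is what "gain normalization" is taken to mean: a constant scaling of the segment leaves the envelope estimate unchanged. Whether this matches the paper's exact procedure cannot be confirmed from the abstract alone.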


Pages: 984 / +

Page count: 2

