Front-end for Far-field Speech Recognition based on Frequency Domain Linear Prediction

Cited by: 0

Authors:

Ganapathy, Sriram [1,2]

Thomas, Samuel [1,2]

Hermansky, Hynek [1,2]

Affiliations:

[1] IDIAP Res Inst, Martigny, Switzerland

[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland

Source:

INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5 | 2008

Keywords:

Frequency Domain Linear Prediction; Front-end for Far-field Speech; Reverberant Speech; Speech Recognition;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Automatic Speech Recognition (ASR) systems usually fail when they encounter speech from a far-field microphone in reverberant environments. This is due to the application of short-term feature extraction techniques, which do not compensate for the artifacts introduced by long room impulse responses. In this paper, we propose a front-end, based on Frequency Domain Linear Prediction (FDLP), that tries to remove reverberation artifacts present in far-field speech. Long temporal segments of far-field speech are analyzed in narrow frequency sub-bands to extract FDLP envelopes and residual signals. Filtering the residual signals with gain-normalized inverse FDLP filters results in a set of sub-band signals, which are synthesized to reconstruct the signal. ASR experiments on far-field speech data processed by the proposed front-end show significant improvements (relative reduction of 30% in word error rate) compared to other robust feature extraction techniques.
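
The envelope-extraction step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it treats the whole signal as a single band (no sub-band decomposition, residual filtering, or resynthesis), and the function names, the model order of 20, and the toy test signal are all assumptions made for illustration. It applies linear prediction to the DCT of a segment, so that the resulting all-pole model approximates the temporal (Hilbert) envelope, and it reads "gain normalized" as setting the model gain to 1, which is one possible interpretation.

```python
import math

def dct2(x):
    """Unnormalized DCT-II computed directly in O(N^2); fine for short inputs."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
            for k in range(n)]

def levinson(r, order):
    """Levinson-Durbin recursion on autocorrelation r.

    Returns (a, err): predictor polynomial a with a[0] = 1 and the
    final prediction-error power.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err

def fdlp_envelope(x, order=20, gain_normalize=True):
    """All-pole estimate of the squared temporal envelope of x (hypothetical helper).

    Linear prediction is run on the DCT coefficients, so the model's
    "spectrum" is indexed by time instead of frequency.
    """
    c = dct2(x)
    n = len(c)
    # Autocorrelation of the DCT sequence for lags 0..order.
    r = [sum(c[i] * c[i + lag] for i in range(n - lag)) for lag in range(order + 1)]
    a, err = levinson(r, order)
    # Assumption: gain normalization means forcing the model gain to 1.
    gain = 1.0 if gain_normalize else err
    env = []
    for t in range(n):
        theta = math.pi * (t + 0.5) / n     # time index mapped to model angle
        re = sum(a[k] * math.cos(k * theta) for k in range(order + 1))
        im = -sum(a[k] * math.sin(k * theta) for k in range(order + 1))
        env.append(gain / (re * re + im * im))
    return env

# Toy input (assumed): a sinusoid whose amplitude bumps around sample 150.
n = 256
x = [(0.05 + math.exp(-((t - 150) / 12.0) ** 2)) * math.cos(2 * math.pi * 0.2 * t)
     for t in range(n)]
env = fdlp_envelope(x)
peak = max(range(n), key=lambda t: env[t])  # sample index of the envelope maximum
```

On the toy signal, the estimated envelope should peak close to the amplitude bump. A sub-band version would apply the same steps to windowed portions of the DCT sequence, one window per band.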

Pages: 984 / +

Page count: 2

50 records in total

[21] A Front-End Technique for Automatic Noisy Speech Recognition
Naing, Hay Mar Soe
Hidayat, Risanuri
Hartanto, Rudy
Miyanaga, Yoshikazu
[J]. PROCEEDINGS OF 2020 23RD CONFERENCE OF THE ORIENTAL COCOSDA INTERNATIONAL COMMITTEE FOR THE CO-ORDINATION AND STANDARDISATION OF SPEECH DATABASES AND ASSESSMENT TECHNIQUES (ORIENTAL-COCOSDA 2020), 2020, : 49 - 54
[22] Automatic Speech Recognition with a Cochlear Implant Front-End
Nogueira, Waldo
Harczos, Tamas
Edler, Bernd
Ostermann, Joern
Buechner, Andreas
[J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1993 - +
[23] Recognition of Reverberant Speech Using Frequency Domain Linear Prediction
Thomas, Samuel
Ganapathy, Sriram
Hermansky, Hynek
[J]. IEEE SIGNAL PROCESSING LETTERS, 2008, 15 : 681 - 684
[24] A Front-End Speech Enhancement System for Robust Automotive Speech Recognition
Wang, Haikun
Ye, Zhongfu
Chen, Jingdong
[J]. 2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2018, : 1 - 5
[25] Investigation of Speech Separation as a Front-End for Noise Robust Speech Recognition
Narayanan, Arun
Wang, DeLiang
[J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (04) : 826 - 835
[26] Dereverberation of autoregressive envelopes for far-field speech recognition
Purushothaman, Anurenjan
Sreeram, Anirudh
Kumar, Rohit
Ganapathy, Sriram
[J]. COMPUTER SPEECH AND LANGUAGE, 2022, 72
[27] Investigation into a Mel subspace based front-end processing for robust speech recognition
Selouani, SA
O'Shaughnessy, D
[J]. Proceedings of the Fourth IEEE International Symposium on Signal Processing and Information Technology, 2004, : 187 - 190
[28] Feature enhancement for a bitstream-based front-end in wireless speech recognition
Kim, HK
Cox, RV
[J]. 2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 241 - 244
[29] Performance evaluation of front-end algorithms for robust speech recognition
Cheng, O
Abdulla, W
Salcic, Z
[J]. ISSPA 2005: THE 8TH INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND ITS APPLICATIONS, VOLS 1 AND 2, PROCEEDINGS, 2005, : 711 - 714
[30] Thin client front-end processor for distributed speech recognition
Chow, KF
Liew, SC
Lua, KT
[J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL II, PROCEEDINGS: SPEECH II; INDUSTRY TECHNOLOGY TRACKS; DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS; NEURAL NETWORKS FOR SIGNAL PROCESSING, 2003, : 29 - 32