Front-end for Far-field Speech Recognition based on Frequency Domain Linear Prediction

Cited by: 0

Authors:

Ganapathy, Sriram [1,2]

Thomas, Samuel [1,2]

Hermansky, Hynek [1,2]

Affiliations:

[1] Idiap Research Institute, Martigny, Switzerland

[2] École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland

Source:

INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5 | 2008

Keywords:

Frequency Domain Linear Prediction; Front-end for Far-field Speech; Reverberant Speech; Speech Recognition;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Automatic Speech Recognition (ASR) systems usually fail when they encounter speech from a far-field microphone in reverberant environments. This is due to the application of short-term feature extraction techniques, which do not compensate for the artifacts introduced by long room impulse responses. In this paper, we propose a front-end, based on Frequency Domain Linear Prediction (FDLP), that tries to remove reverberation artifacts present in far-field speech. Long temporal segments of far-field speech are analyzed in narrow frequency sub-bands to extract FDLP envelopes and residual signals. Filtering the residual signals with gain-normalized inverse FDLP filters results in a set of sub-band signals, which are synthesized to reconstruct the signal. ASR experiments on far-field speech data processed by the proposed front-end show significant improvements (relative reduction of 30% in word error rate) compared to other robust feature extraction techniques.
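
The core idea behind FDLP can be illustrated with a small sketch: linear prediction applied to the DCT of a long segment yields an all-pole model whose response, read along the time axis, approximates the squared temporal (Hilbert) envelope of that segment. The Python code below is only an illustration under stated assumptions, not the authors' implementation: it works on a single full-band segment rather than narrow sub-bands, omits the residual filtering and resynthesis steps, and the function names (`dct_ii`, `levinson`, `fdlp_envelope`), the model order of 8, and the synthetic test signal are all hypothetical choices made for this example.

```python
import math

def dct_ii(x):
    """Orthonormal DCT-II computed directly (O(N^2)); adequate for a short demo."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def autocorr(c, order):
    """Autocorrelation of the sequence c for lags 0..order."""
    return [sum(c[i] * c[i + lag] for i in range(len(c) - lag))
            for lag in range(order + 1)]

def levinson(r, order):
    """Levinson-Durbin recursion; returns (a, err) with a[0] == 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[p] * r[m - p] for p in range(1, m))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for p in range(1, m):
            new_a[p] = a[p] + k * a[m - p]
        new_a[m] = k
        a = new_a
        err *= (1.0 - k * k)                # updated prediction error energy
    return a, err

def fdlp_envelope(x, order=8):
    """All-pole estimate of the squared temporal envelope of x.

    Linear prediction is run on the DCT coefficients (frequency domain),
    so the model 'spectrum' is indexed by time rather than frequency.
    """
    n = len(x)
    a, err = levinson(autocorr(dct_ii(x), order), order)
    env = []
    for t in range(n):
        w = math.pi * (t + 0.5) / n         # time index mapped to angle in (0, pi)
        re = sum(a[p] * math.cos(w * p) for p in range(order + 1))
        im = -sum(a[p] * math.sin(w * p) for p in range(order + 1))
        env.append(err / (re * re + im * im))
    return env

# Synthetic segment (an assumption for the demo): a carrier with a burst
# of energy centred on sample 150, sitting on a low constant floor.
N = 256
signal = [(0.1 + math.exp(-0.5 * ((t - 150) / 6.0) ** 2)) * math.cos(0.4 * math.pi * t)
          for t in range(N)]
envelope = fdlp_envelope(signal, order=8)
peak = max(range(N), key=lambda t: envelope[t])
print("envelope peak near sample", peak)
```

In this sketch the estimated envelope should peak close to the energy burst in the synthetic signal. A front-end along the lines of the abstract would additionally split the DCT into sub-bands, fit one such model per band, and use the gain-normalized models to filter the residuals before resynthesis.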

Citation

Pages: 984 / +

Number of pages: 2

50 entries in total (first 10 shown)

[1] MULTICHANNEL AUDIO FRONT-END FOR FAR-FIELD AUTOMATIC SPEECH RECOGNITION
Chhetri, Amit
Hilmes, Philip
Kristjansson, Trausti
Chu, Wai
Mansour, Mohamed
Li, Xiaoxue
Zhang, Xianxian
[J]. 2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2018, : 1527 - 1531
[2] Task-Specific Optimization of Virtual Channel Linear Prediction-Based Speech Dereverberation Front-End for Far-Field Speaker Verification
Yang, Joon-Young
Chang, Joon-Hyuk
[J]. IEEE/ACM Transactions on Audio Speech and Language Processing, 2022, 30 : 3144 - 3159
[3] Task-Specific Optimization of Virtual Channel Linear Prediction-Based Speech Dereverberation Front-End for Far-Field Speaker Verification
Yang, Joon-Young
Chang, Joon-Hyuk
[J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 3144 - 3159
[4] A Reassigned Front-End for Speech Recognition
Tryfou, Georgina
Omologo, Maurizio
[J]. 2017 25TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2017, : 553 - 557
[5] Speech Recognition with Frequency Domain Linear Prediction
Harshita, P.
Adiga, Akshay R.
[J]. PROCEEDINGS OF THE 2018 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATION AND SIGNAL PROCESSING (ICCSP), 2018, : 630 - 634
[6] Curriculum Learning based approaches for robust end-to-end far-field speech recognition
Ranjan, Shivesh
Hansen, John H. L.
[J]. SPEECH COMMUNICATION, 2021, 132 : 123 - 131
[7] End-to-End Far-Field Speech Recognition with Unified Dereverberation and Beamforming
Zhang, Wangyou
Subramanian, Aswin Shanmugam
Chang, Xuankai
Watanabe, Shinji
Qian, Yanmin
[J]. INTERSPEECH 2020, 2020, : 324 - 328
[8] Far-Field Automatic Speech Recognition
Haeb-Umbach, Reinhold
Heymann, Jahn
Drude, Lukas
Watanabe, Shinji
Delcroix, Marc
Nakatani, Tomohiro
[J]. PROCEEDINGS OF THE IEEE, 2021, 109 (02) : 124 - 148
[9] The speech recognition based on the bark wavelet front-end processing
Zhang, XY
Jiao, ZP
Zhao, ZF
[J]. FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, PT 2, PROCEEDINGS, 2005, 3614 : 302 - 305
[10] Wavelet-based Front-End for Electromyographic Speech Recognition
Wand, Michael
Jou, Szu-Chen Stan
Schultz, Tanja
[J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1773 - +
