Front-end for Far-field Speech Recognition based on Frequency Domain Linear Prediction

Citations: 0
Authors
Ganapathy, Sriram [1, 2]
Thomas, Samuel [1, 2]
Hermansky, Hynek [1, 2]
Affiliations
[1] IDIAP Res Inst, Martigny, Switzerland
[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
Keywords
Frequency Domain Linear Prediction; Front-end for Far-field Speech; Reverberant Speech; Speech Recognition
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Automatic Speech Recognition (ASR) systems usually fail when they encounter speech from a far-field microphone in reverberant environments. This is due to the application of short-term feature extraction techniques, which do not compensate for the artifacts introduced by long room impulse responses. In this paper, we propose a front-end, based on Frequency Domain Linear Prediction (FDLP), that tries to remove reverberation artifacts present in far-field speech. Long temporal segments of far-field speech are analyzed in narrow frequency sub-bands to extract FDLP envelopes and residual signals. Filtering the residual signals with gain-normalized inverse FDLP filters results in a set of sub-band signals, which are synthesized to reconstruct the signal. ASR experiments on far-field speech data processed by the proposed front-end show significant improvements (relative reduction of 30% in word error rate) compared to other robust feature extraction techniques.
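The envelope-extraction step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `fdlp_envelope`, the model order, the use of a type-II DCT, and the full-band (rather than sub-band) analysis are all assumptions made for illustration. The idea shown is that linear prediction applied to the DCT of a long segment yields an all-pole model whose power response approximates the temporal envelope of that segment. The `gain_normalize` flag simply sets the model gain to 1, which is one plausible reading of "gain-normalized"; residual filtering and sub-band synthesis are not covered here.

```python
import numpy as np
from scipy.fft import dct
from scipy.linalg import solve_toeplitz
from scipy.signal import freqz


def fdlp_envelope(x, order=20, n_points=None, gain_normalize=False):
    """Approximate the temporal envelope of x with an all-pole model.

    Hypothetical helper: linear prediction is run on the DCT of x, so the
    model's power response is a smooth function of time, not frequency.
    """
    x = np.asarray(x, dtype=float)
    n_points = n_points or len(x)

    # Frequency-domain representation of the long segment (assumed DCT-II).
    c = dct(x, type=2, norm="ortho")

    # Autocorrelation of the DCT sequence for lags 0..order.
    r = np.array([np.dot(c[: len(c) - k], c[k:]) for k in range(order + 1)])
    r[0] *= 1.0 + 1e-9  # tiny regularization for numerical stability

    # Yule-Walker equations: predict c[n] from sum_k a[k] * c[n - k - 1].
    a = solve_toeplitz(r[:order], r[1 : order + 1])

    # Prediction-error energy acts as the model gain.
    gain = r[0] - np.dot(a, r[1 : order + 1])
    if gain_normalize:
        gain = 1.0  # assumption: normalization means unit gain

    # Evaluate 1 / |A|^2 on the upper half of the unit circle; angles
    # 0..pi map onto time 0..len(x) for a DCT-II input.
    _, h = freqz([1.0], np.concatenate(([1.0], -a)), worN=n_points)
    return gain * np.abs(h) ** 2
```

Calling `fdlp_envelope(segment, order=20)` on a long segment returns one envelope value per input sample by default. A higher `order` follows finer temporal detail, while a lower one gives a smoother envelope.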
Pages: 984 / +
Page count: 2
Related papers
50 in total
  • [41] Enhanced Sparse Imputation Techniques for a Robust Speech Recognition Front-End
    Tan, Qun Feng
    Georgiou, Panayiotis G.
    Narayanan, Shrikanth
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (08): 2418 - 2429
  • [42] An Efferent-Inspired Auditory Model Front-End for Speech Recognition
    Lee, Chia-ying
    Glass, James
    Ghitza, Oded
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011: 56 - +
  • [43] Advanced Front-end for Robust Speech Recognition in Extremely Adverse Environments
    Dimitriadis, Dimitrios
    Segura, Jose C.
    Garcia, Luz
    Potamianos, Alexandros
    Maragos, Petros
    Pitsikalis, Vassilis
    [J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007: 2221 - +
  • [44] Combined Software/hardware implementation of a filterbank front-end for speech recognition
    Mouchtaris, A
    Cao, Y
    Khan, S
    Van der Spiegel, J
    [J]. 2005 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS - DESIGN AND IMPLEMENTATION (SIPS), 2005: 436 - 441
  • [45] Front-End Feature Compensation for Noise Robust Speech Emotion Recognition
    Pandharipande, Meghna
    Chakraborty, Rupayan
    Panda, Ashish
    Das, Biswajit
    Kopparapu, Sunil Kumar
    [J]. 2019 27TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2019
  • [46] A Speaker-Dependent Approach to Separation of Far-Field Multi-Talker Microphone Array Speech for Front-End Processing in the CHiME-5 Challenge
    Sun, Lei
    Du, Jun
    Gao, Tian
    Fang, Yi
    Ma, Feng
    Lee, Chin-Hui
    [J]. IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2019, 13 (04): 827 - 840
  • [47] Performance improvement of a bitstream-based front-end for wireless speech recognition in adverse environments
    Kim, HK
    Cox, RV
    Rose, RC
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2002, 10 (08): 591 - 604
  • [48] Robust front-end for speech recognition based on computational auditory scene analysis and speaker model
    Guan, Yong
    Li, Peng
    Liu, Wen-Ju
    Xu, Bo
    [J]. Zidonghua Xuebao / Acta Automatica Sinica, 2009, 35 (04): 410 - 416
  • [49] A New Subband-Weighted MVDR-Based Front-End for Robust Speech Recognition
    Seyedin, Sanaz
    Ahadi, Seyed Mohammad
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2010, E93D (08): 2252 - 2261
  • [50] A bitstream-based front-end for wireless speech recognition on IS-136 communications system
    Kim, HK
    Cox, RV
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2001, 9 (05): 558 - 568