Front-end for Far-field Speech Recognition based on Frequency Domain Linear Prediction

Cited by: 0
Authors
Ganapathy, Sriram [1 ,2 ]
Thomas, Samuel [1 ,2 ]
Hermansky, Hynek [1 ,2 ]
Affiliations
[1] IDIAP Res Inst, Martigny, Switzerland
[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
Keywords
Frequency Domain Linear Prediction; Front-end for Far-field Speech; Reverberant Speech; Speech Recognition;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Automatic Speech Recognition (ASR) systems usually fail when they encounter speech from far-field microphones in reverberant environments. This is due to the use of short-term feature extraction techniques, which do not compensate for the artifacts introduced by long room impulse responses. In this paper, we propose a front-end, based on Frequency Domain Linear Prediction (FDLP), that aims to remove the reverberation artifacts present in far-field speech. Long temporal segments of far-field speech are analyzed in narrow frequency sub-bands to extract FDLP envelopes and residual signals. Filtering the residual signals with gain-normalized inverse FDLP filters results in a set of sub-band signals, which are synthesized to reconstruct the signal. ASR experiments on far-field speech data processed by the proposed front-end show significant improvements (a relative reduction of 30% in word error rate) compared to other robust feature extraction techniques.
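The envelope-extraction step mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`dct2`, `fdlp_envelope`), the rectangular sub-band window, the model order, and the small diagonal loading are all assumptions made for illustration. The sketch relies only on the general FDLP idea that linear prediction applied to the DCT coefficients of a long segment yields an all-pole model whose "spectrum" approximates the temporal envelope of that sub-band. The residual filtering and gain normalization are not shown because the abstract does not give their details.

```python
import numpy as np


def dct2(x):
    """Unnormalized DCT-II of a 1-D signal, computed with a zero-padded FFT."""
    n = len(x)
    k = np.arange(n)
    spectrum = np.fft.fft(x, 2 * n)[:n]
    return np.real(np.exp(-1j * np.pi * k / (2 * n)) * spectrum)


def fdlp_envelope(x, band, order=12, loading=1e-4):
    """Approximate the temporal envelope of one sub-band of x.

    band    : (lo, hi) DCT-bin indices of the sub-band (hypothetical choice)
    order   : linear-prediction order (assumed value)
    loading : relative diagonal loading that keeps the normal equations
              well conditioned (assumed value)
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    lo, hi = band
    sub = dct2(x)[lo:hi]  # rectangular window over the DCT coefficients

    # Autocorrelation of the sub-band DCT sequence for lags 0..order.
    full = np.correlate(sub, sub, mode="full")
    zero_lag = len(sub) - 1
    r = full[zero_lag:zero_lag + order + 1].copy()
    r[0] *= 1.0 + loading

    # Yule-Walker normal equations: R a = -r, with R Toeplitz.
    idx = np.arange(order)
    R = r[np.abs(np.subtract.outer(idx, idx))]
    a_tail = np.linalg.solve(R, -r[1:])
    a = np.concatenate(([1.0], a_tail))
    gain = r[0] + a_tail @ r[1:]  # prediction-error energy

    # Evaluate gain / |A|^2 on n points over [0, pi); this axis maps to time.
    A = np.fft.fft(a, 2 * n)[:n]
    return gain / np.abs(A) ** 2


# Synthetic check: a Gaussian burst centred at sample 2800 on a carrier
# that falls inside the analysed sub-band.
n = 4000
t = np.arange(n)
rng = np.random.default_rng(0)
burst = np.exp(-0.5 * ((t - 2800) / 100.0) ** 2) * np.cos(0.25 * np.pi * t)
x = burst + 0.001 * rng.standard_normal(n)

env = fdlp_envelope(x, band=(800, 1200))
peak = int(np.argmax(env))
print("envelope peak near sample", peak)
```

In this toy setup the estimated envelope should peak close to the centre of the burst, which is the property the front-end exploits: the all-pole fit captures the dominant temporal structure of each sub-band over a long segment rather than over short frames.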
Pages: 984 / +
Number of pages: 2