Optimization of Dereverberation Parameters based on Likelihood of Speech Recognizer

Cited by: 0
Authors
Gomez, Randy [1]
Kawahara, Tatsuya [1]
Affiliation
[1] Kyoto Univ, ACCMS, Sakyo Ku, Kyoto 6068501, Japan
Keywords
Dereverberation; Robust ASR;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Speech recognition under reverberant conditions is a difficult task. Most dereverberation techniques used to address this problem enhance the reverberant waveform independently of the speech recognizer. In this paper, we improve the conventional Spectral Subtraction-based (SS) dereverberation technique. In our proposed approach, the dereverberation parameters are optimized to improve the likelihood of the acoustic model. The system is capable of adaptively fine-tuning these parameters jointly with acoustic model training. Additional optimization is also performed during decoding of the test utterances. We evaluated the method on real reverberant data, and the experimental results show that it significantly improves recognition performance over the conventional approach.
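The idea of choosing a dereverberation parameter by acoustic-model likelihood can be illustrated with a minimal sketch. This is not the authors' implementation: a single diagonal Gaussian stands in for the acoustic model, a grid search stands in for the optimization procedure, the late-reverberation power estimate is assumed to be given, and all names (`spectral_subtraction`, `log_likelihood`, `optimize_alpha`, `alpha`, `beta`) are hypothetical.

```python
import numpy as np

def spectral_subtraction(power, late_reverb, alpha, beta=0.01):
    """Subtract alpha * late-reverberation power, flooring at beta * observed power."""
    return np.maximum(power - alpha * late_reverb, beta * power)

def log_likelihood(features, mean, var):
    """Total log-likelihood of feature frames under a diagonal Gaussian
    (a stand-in for a real acoustic model, assumed for illustration)."""
    diff = features - mean
    return float(-0.5 * np.sum(np.log(2.0 * np.pi * var) + diff ** 2 / var))

def optimize_alpha(power, late_reverb, mean, var, candidates):
    """Grid search: return the candidate alpha whose enhanced log-power
    spectrum scores highest under the model."""
    scores = [
        log_likelihood(np.log(spectral_subtraction(power, late_reverb, a)), mean, var)
        for a in candidates
    ]
    return float(candidates[int(np.argmax(scores))])

# Toy data (assumed): 50 identical frames, 3 frequency bins.
clean = np.tile(np.array([1.0, 2.0, 4.0]), (50, 1))
late = 0.5 * clean                 # late-reverberation power, assumed known
observed = clean + late            # reverberant power spectrum

# The toy "acoustic model" is centred on the clean log-power spectrum.
mean = np.log(np.array([1.0, 2.0, 4.0]))
var = np.ones(3)

candidates = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
best_alpha = optimize_alpha(observed, late, mean, var, candidates)
print(best_alpha)
```

In this toy setup the subtraction weight that exactly removes the late reverberation also gives the highest likelihood. With a real acoustic model and real data the likelihood surface need not behave this way, so the sketch only shows the selection mechanism.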
Pages: 1259-1262
Page count: 4
Related papers
50 in total
  • [1] Speech recognizer based maximum likelihood beamforming
    Raj, B
    Seltzer, M
    Reyes-Gomez, MJ
    [J]. SPEECH SEPARATION BY HUMANS AND MACHINES, 2005, : 65 - 82
  • [2] Robust Speech Recognition Based on Dereverberation Parameter Optimization Using Acoustic Model Likelihood
    Gomez, Randy
    Kawahara, Tatsuya
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (07): : 1708 - 1716
  • [3] A Maximum Likelihood Approach to Deep Neural Network Based Speech Dereverberation
    Wang, Xin
    Du, Jun
    Wang, Yannan
    [J]. 2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC 2017), 2017, : 155 - 158
  • [4] MAXIMUM-LIKELIHOOD-BASED CEPSTRAL INVERSE FILTERING FOR BLIND SPEECH DEREVERBERATION
    Kumar, Kshitiz
    Stern, Richard M.
    [J]. 2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4282 - 4285
  • [5] A speech recognizer with selectable model parameters
    Han, W
    Chan, CF
    Choy, CS
    Pun, KP
    [J]. 2005 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), VOLS 1-6, CONFERENCE PROCEEDINGS, 2005, : 5842 - 5845
  • [6] Syllable Based Continuous Speech Recognizer With Varied Length Maximum Likelihood Character Segmentation
    Ganesh, Akila A.
    Ravichandran, Chandra
    [J]. 2013 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2013, : 935 - 940
  • [7] Speech-recognizer-based filter optimization for microphone array processing
    Seltzer, ML
    Raj, B
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2003, 10 (03) : 69 - 71
  • [8] Speech Recognizer Optimization under Speed Constraints
    Bulyko, Ivan
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 1497 - 1500
  • [9] SPEECH DEREVERBERATION BASED ON CONVEX OPTIMIZATION ALGORITHMS FOR GROUP SPARSE LINEAR PREDICTION
    Giacobello, Daniele
    Jensen, Tobias Lindstrom
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 446 - 450
  • [10] Blind Separation and Dereverberation of Speech Mixtures by Joint Optimization
    Yoshioka, Takuya
    Nakatani, Tomohiro
    Miyoshi, Masato
    Okuno, Hiroshi G.
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (01): : 69 - 84