MODEL-BASED DEREVERBERATION IN THE LOGMELSPEC DOMAIN FOR ROBUST DISTANT-TALKING SPEECH RECOGNITION

Cited by: 3
Authors
Sehr, Armin [1]
Maas, Roland [1]
Kellermann, Walter [1]
Affiliations
[1] Univ Erlangen Nurnberg, D-91058 Erlangen, Germany
Keywords
Reverberation; model-based dereverberation; acoustic modeling; distant-talking ASR; robust ASR
DOI
10.1109/ICASSP.2010.5495671
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
The REMOS (REverberation MOdeling for Speech recognition) concept for reverberation-robust distant-talking speech recognition, introduced in [1] for melspectral features, is extended in this contribution to logarithmic melspectral (logmelspec) features. Based on a combined acoustic model consisting of a hidden Markov model network and a reverberation model, REMOS determines clean-speech and reverberation estimates during recognition by an inner optimization operation. A reformulation of this inner optimization problem for logmelspec features, allowing an efficient solution by nonlinear optimization algorithms, is derived in this paper so that an efficient implementation of REMOS for logmelspec features becomes possible. Connected digit recognition experiments show that the proposed REMOS implementation significantly outperforms reverberantly-trained HMMs in highly reverberant environments.
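The abstract does not spell out the reverberation model, so the following is only an illustrative sketch under an explicit assumption: that reverberation acts as a per-channel convolution of clean-speech frames with reverberation-model frames in the melspec domain. Under that assumption, the same relation in the logmelspec domain becomes a log-sum-exp over delayed frames, which is the kind of nonlinearity a nonlinear optimizer would have to handle. The function name `reverberant_logmelspec` and the arrays `s_log` and `h_log` are hypothetical and are not taken from the paper.

```python
import numpy as np

def reverberant_logmelspec(s_log, h_log):
    """Combine clean-speech and reverberation-model frames in the log-mel domain.

    ASSUMPTION (not from the paper): the reverberant melspec frame is a
    per-channel convolution x[n, k] = sum_m h[m, k] * s[n - m, k].
    In the log domain this is log(sum_m exp(h_log[m, k] + s_log[n - m, k])).

    s_log: (N, K) clean-speech log-mel frames
    h_log: (M, K) reverberation-model log-mel frames
    returns: (N, K) reverberant log-mel frames
    """
    n_frames, n_channels = s_log.shape
    n_taps = h_log.shape[0]
    x_log = np.full((n_frames, n_channels), -np.inf)
    for m in range(min(n_taps, n_frames)):
        # Contribution of the clean frames delayed by m frames.
        term = np.full((n_frames, n_channels), -np.inf)
        term[m:] = s_log[:n_frames - m] + h_log[m]
        # Numerically stable log(exp(a) + exp(b)).
        x_log = np.logaddexp(x_log, term)
    return x_log
```

A recognizer following this idea would search for clean-speech and reverberation estimates whose log-sum-exp combination matches the observed reverberant frame; that search is the inner optimization the abstract refers to, and it is not implemented here.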
Pages: 4298-4301
Number of pages: 4