MODEL-BASED DEREVERBERATION IN THE LOGMELSPEC DOMAIN FOR ROBUST DISTANT-TALKING SPEECH RECOGNITION

Cited by: 3

Authors:

Sehr, Armin [1]

Maas, Roland [1]

Kellermann, Walter [1]

Affiliations:

[1] Univ Erlangen Nurnberg, D-91058 Erlangen, Germany

Source:

2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2010

Keywords:

Reverberation; model-based dereverberation; acoustic modeling; distant-talking ASR; robust ASR

DOI:

10.1109/ICASSP.2010.5495671

Chinese Library Classification:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

The REMOS (REverberation MOdeling for Speech recognition) concept for reverberation-robust distant-talking speech recognition, introduced in [1] for melspectral features, is extended in this contribution to logarithmic melspectral (logmelspec) features. Based on a combined acoustic model consisting of a hidden Markov model network and a reverberation model, REMOS determines clean-speech and reverberation estimates during recognition by an inner optimization operation. A reformulation of this inner optimization problem for logmelspec features, allowing an efficient solution by nonlinear optimization algorithms, is derived in this paper so that an efficient implementation of REMOS for logmelspec features becomes possible. Connected digit recognition experiments show that the proposed REMOS implementation significantly outperforms reverberantly-trained HMMs in highly reverberant environments.

Pages: 4298-4301

Number of pages: 4

50 items in total

[1] Reverberation Model-Based Decoding in the Logmelspec Domain for Robust Distant-Talking Speech Recognition
Sehr, Armin
Maas, Roland
Kellermann, Walter
[J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (07): 1676 - 1691
[2] Robust distant-talking speech recognition
Lin, Q
Che, C
Yuk, DS
Jin, L
deVries, B
Pearson, J
Flanagan, J
[J]. 1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996: 21 - 24
[3] JOINT SPARSE REPRESENTATION BASED CEPSTRAL-DOMAIN DEREVERBERATION FOR DISTANT-TALKING SPEECH RECOGNITION
Li, Weifeng
Wang, Longbiao
Zhou, Fei
Liao, Qingmin
[J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013: 7117 - 7120
[4] Combination of bottleneck feature extraction and dereverberation for distant-talking speech recognition
Bo Ren
Longbiao Wang
Liang Lu
Yuma Ueda
Atsuhiko Kai
[J]. Multimedia Tools and Applications, 2016, 75: 5093 - 5108
[5] Combination of bottleneck feature extraction and dereverberation for distant-talking speech recognition
Ren, Bo
Wang, Longbiao
Lu, Liang
Ueda, Yuma
Kai, Atsuhiko
[J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2016, 75 (09): 5093 - 5108
[6] Distant-talking Continuous Speech Recognition based on a novel Reverberation Model in the Feature Domain
Sehr, Armin
Zeller, Marcus
Kellermann, Walter
[J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006: 769 - 772
[7] A new concept for feature-domain dereverberation for robust distant-talking ASR
Sehr, Armin
Kellermann, Walter
[J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007: 369 - +
[8] Improved HMM separation for distant-talking speech recognition
Takiguchi, T
Nishimura, M
[J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2004, E87D (05): 1127 - 1137
[9] Hidden Markov model training with contaminated speech material for distant-talking speech recognition
Matassoni, M
Omologo, M
Giuliani, D
Svaizer, P
[J]. COMPUTER SPEECH AND LANGUAGE, 2002, 16 (02): 205 - 223
[10] Single-channel dereverberation for distant-talking speech recognition by combining denoising autoencoder and temporal structure normalization
Ueda, Yuma
Wang, Longbiao
Kai, Atsuhiko
Xiao, Xiong
Chng, Eng Siong
Li, Haizhou
[J]. 2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014: 379 - +