Blind model selection for automatic speech recognition in reverberant environments

Cited by: 13

Authors:

Couvreur, L

Couvreur, C

Affiliations:

[1] Fac Polytech Mons, Multitel TCTS, B-7000 Mons, Belgium

[2] Scansoft Inc, Speech & Language Technol Div, B-9820 Merelbeke, Belgium

Source:

JOURNAL OF VLSI SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY | 2004, Vol. 36, Issue 2-3

Keywords:

room reverberation; maximum likelihood estimation; automatic speech recognition

DOI:

10.1023/B:VLSI.0000015096.78139.82

Chinese Library Classification number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

This communication presents a new method for automatic speech recognition in reverberant environments. Our approach consists of selecting the best acoustic model from a library of models trained on artificially reverberated speech databases corresponding to various reverberant conditions. Given a speech utterance recorded in a reverberant room, a maximum likelihood estimate of the fullband room reverberation time is computed using a statistical model for short-term log-energy sequences of anechoic speech. The estimated reverberation time is then used to select the best acoustic model, i.e., the model trained on the speech database most closely matching the estimated reverberation time, and that model is used to recognize the reverberated speech utterance. The proposed model selection approach is shown to significantly improve recognition accuracy on a connected digit task in both simulated and real reverberant environments, outperforming standard channel normalization techniques.
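The selection step described in the abstract can be illustrated with a minimal sketch. It assumes the reverberation time has already been estimated and covers only the nearest-match lookup, not the maximum likelihood estimator itself. The function name `select_model`, the library contents, and the reverberation times are hypothetical and are not taken from the paper.

```python
from typing import Dict


def select_model(estimated_t60: float, library: Dict[float, str]) -> str:
    """Return the model trained at the reverberation time closest to the estimate."""
    if not library:
        raise ValueError("model library is empty")
    # Nearest-neighbour match on reverberation time (in seconds).
    closest_t60 = min(library, key=lambda t60: abs(t60 - estimated_t60))
    return library[closest_t60]


# Hypothetical library: training reverberation time (s) -> acoustic model name.
MODEL_LIBRARY = {
    0.0: "hmm_anechoic",
    0.2: "hmm_t60_0.2s",
    0.4: "hmm_t60_0.4s",
    0.8: "hmm_t60_0.8s",
}

if __name__ == "__main__":
    # An estimate of 0.33 s is closest to the model trained at 0.4 s.
    print(select_model(0.33, MODEL_LIBRARY))
```

In this sketch an estimate outside the range covered by the library simply maps to the nearest end of the range; how such cases are handled in the actual system is not stated in the abstract.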

Pages: 189-203

Number of pages: 15
