Blind model selection for automatic speech recognition in reverberant environments

Cited by: 13
Authors
Couvreur, L
Couvreur, C
Affiliations
[1] Fac Polytech Mons, Multitel TCTS, B-7000 Mons, Belgium
[2] Scansoft Inc, Speech & Language Technol Div, B-9820 Merelbeke, Belgium
Keywords
room reverberation; maximum likelihood estimation; automatic speech recognition
DOI
10.1023/B:VLSI.0000015096.78139.82
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This communication presents a new method for automatic speech recognition in reverberant environments. Our approach selects the best acoustic model from a library of models trained on artificially reverberated speech databases, each corresponding to a different reverberant condition. Given a speech utterance recorded in a reverberant room, a maximum likelihood estimate of the fullband room reverberation time is computed using a statistical model for short-term log-energy sequences of anechoic speech. The estimated reverberation time is then used to select the best acoustic model, i.e., the model trained on the speech database that most closely matches the estimated reverberation time, and this model is used to recognize the reverberated utterance. The proposed model selection approach is shown to significantly improve recognition accuracy for a connected digit task in both simulated and real reverberant environments, outperforming standard channel normalization techniques.
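The selection step described in the abstract can be illustrated with a minimal sketch. It assumes the reverberation time has already been estimated (the maximum likelihood estimator itself is not reproduced here) and that the model library is a mapping from training reverberation time to a model. The function name `select_acoustic_model`, the library contents, and the reverberation times are hypothetical and are not taken from the paper.

```python
from typing import Dict


def select_acoustic_model(estimated_rt: float, model_library: Dict[float, str]) -> str:
    """Return the model whose training reverberation time is closest to the estimate.

    estimated_rt: reverberation time in seconds, assumed to come from a
        separate blind estimator (not implemented in this sketch).
    model_library: hypothetical mapping from training reverberation time
        (seconds) to a model identifier.
    """
    if not model_library:
        raise ValueError("model library is empty")
    # Nearest-neighbour match on reverberation time.
    nearest_rt = min(model_library, key=lambda rt: abs(rt - estimated_rt))
    return model_library[nearest_rt]


# Hypothetical library: one model per simulated reverberant condition.
library = {
    0.0: "model_anechoic",
    0.3: "model_rt_0.3s",
    0.6: "model_rt_0.6s",
    1.0: "model_rt_1.0s",
}

selected = select_acoustic_model(0.52, library)
print(selected)
```

A nearest-match rule is only one possible reading of "most closely matching"; the paper may use a different matching criterion.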
Pages: 189-203
Page count: 15
Source: Laurent Couvreur, Christophe Couvreur. Journal of VLSI Signal Processing Systems for Signal, Image and Video Technology, 2004, 36: 189-203.