Strategies for distant speech recognition in reverberant environments

Cited by: 46
Authors
Delcroix, Marc [1 ]
Yoshioka, Takuya [1 ]
Ogawa, Atsunori [1 ]
Kubo, Yotaro [1 ]
Fujimoto, Masakiyo [1 ]
Ito, Nobutaka [1 ]
Kinoshita, Keisuke [1 ]
Espi, Miquel [1 ]
Araki, Shoko [1 ]
Hori, Takaaki [1 ]
Nakatani, Tomohiro [1 ]
Affiliations
[1] NTT Corp, NTT Commun Sci Labs, Kyoto, Japan
Keywords
Reverberant speech recognition; Robust speech recognition; REVERB challenge; Dereverberation; Noise reduction; Deep neural network; BLIND SEPARATION; DEREVERBERATION; NOISE; ADAPTATION; MIXTURES; FEATURES; DOMAIN
DOI
10.1186/s13634-015-0245-7
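Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract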
Reverberation and noise are known to severely affect the automatic speech recognition (ASR) performance of speech recorded by distant microphones. Therefore, we must deal with reverberation if we are to realize high-performance hands-free speech recognition. In this paper, we review a recognition system that we developed at our laboratory to deal with reverberant speech. The system consists of a speech enhancement (SE) front-end that employs long-term linear prediction-based dereverberation followed by noise reduction. We combine our SE front-end with an ASR back-end that uses neural networks for acoustic and language modeling. The proposed system achieved top scores on the ASR task of the REVERB challenge. This paper describes the different technologies used in our system and presents detailed experimental results that justify our implementation choices and may provide hints for designing distant ASR systems.
Pages: 15
Related papers
50 in total
  • [1] Strategies for distant speech recognition in reverberant environments
    Marc Delcroix
    Takuya Yoshioka
    Atsunori Ogawa
    Yotaro Kubo
    Masakiyo Fujimoto
    Nobutaka Ito
    Keisuke Kinoshita
    Miquel Espi
    Shoko Araki
    Takaaki Hori
    Tomohiro Nakatani
    [J]. EURASIP Journal on Advances in Signal Processing, 2015
  • [2] CENSREC-4: An evaluation framework for distant-talking speech recognition in reverberant environments
    Fukumori, Takahiro
    Nishiura, Takanobu
    Nakayama, Masato
    Denda, Yuki
    Kitaoka, Norihide
    Yamada, Takeshi
    Yamamoto, Kazumasa
    Tsuge, Satoru
    Fujimoto, Masakiyo
    Takiguchi, Tetsuya
    Miyajima, Chiyomi
    Tamura, Satoshi
    Ogawa, Tetsuji
    Matsuda, Shigeki
    Kuroiwa, Shingo
    Takeda, Kazuya
    Nakamura, Satoshi
    [J]. ACOUSTICAL SCIENCE AND TECHNOLOGY, 2011, 32 (05) : 201 - 210
  • [3] Speech Emotion Recognition in Noisy and Reverberant Environments
    Heracleous, Panikos
    Yasuda, Keiji
    Sugaya, Fumiaki
    Yoneyama, Akio
    Hashimoto, Masayuki
    [J]. 2017 SEVENTH INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2017, : 262 - 266
  • [4] Survey on Approaches to Speech Recognition in Reverberant Environments
    Yoshioka, Takuya
    Sehr, Armin
    Delcroix, Marc
    Kinoshita, Keisuke
    Maas, Roland
    Nakatani, Tomohiro
    Kellermann, Walter
    [J]. 2012 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2012,
  • [5] Robust Speaker Recognition from Distant Speech under Real Reverberant Environments Using Speaker Embeddings
    Nandwana, Mahesh Kumar
    van Hout, Julien
    McLaren, Mitchell
    Stauffer, Allen
    Richey, Colleen
    Lawson, Aaron
    Graciarena, Martin
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 1106 - 1110
  • [6] Acoustic diversity for improved speech recognition in reverberant environments
    Gillespie, BW
    Atlas, LE
    [J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 557 - 560
  • [7] Methods for Robust Speech Recognition in Reverberant Environments: A Comparison
    Petrick, Rico
    Feher, Thomas
    Unoki, Masashi
    Hoffmann, Ruediger
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 582 - +
  • [8] Speech Recognition in reverberant environments using remote microphones
    Brayda, Luca
    Wellekens, Christian
    Matassoni, Marco
    Omologo, Maurizio
    [J]. ISM 2006: EIGHTH IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA, PROCEEDINGS, 2006, : 584 - 591
  • [9] Speech recognition in multisource reverberant environments with binaural inputs
    Roman, Nicoleta
    Srinivasan, Soundararajan
    Wang, DeLiang
    [J]. 2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 309 - 312
  • [10] Speech detection and enhancement using single microphone for distant speech applications in reverberant environments
    Kothapally, Vinay
    Hansen, John H. L.
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1948 - 1952