Strategies for distant speech recognition in reverberant environments

Cited by: 46

Authors:

Delcroix, Marc [1]

Yoshioka, Takuya [1]

Ogawa, Atsunori [1]

Kubo, Yotaro [1]

Fujimoto, Masakiyo [1]

Ito, Nobutaka [1]

Kinoshita, Keisuke [1]

Espi, Miquel [1]

Araki, Shoko [1]

Hori, Takaaki [1]

Nakatani, Tomohiro [1]

Affiliation:

[1] NTT Corp, NTT Commun Sci Labs, Kyoto, Japan

Source:

EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING | 2015

Keywords:

Reverberant speech recognition; Robust speech recognition; REVERB challenge; Dereverberation; Noise reduction; Deep neural network; BLIND SEPARATION; DEREVERBERATION; NOISE; ADAPTATION; MIXTURES; FEATURES; DOMAIN;

DOI:

10.1186/s13634-015-0245-7

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

Reverberation and noise are known to severely affect the automatic speech recognition (ASR) performance of speech recorded by distant microphones. Therefore, we must deal with reverberation if we are to realize high-performance hands-free speech recognition. In this paper, we review a recognition system that we developed at our laboratory to deal with reverberant speech. The system consists of a speech enhancement (SE) front-end that employs long-term linear prediction-based dereverberation followed by noise reduction. We combine our SE front-end with an ASR back-end that uses neural networks for acoustic and language modeling. The proposed system achieved top scores on the ASR task of the REVERB challenge. This paper describes the different technologies used in our system and presents detailed experimental results that justify our implementation choices and may provide hints for designing distant ASR systems.


Pages: 15
