ROBUST SPEECH RECOGNITION IN UNKNOWN REVERBERANT AND NOISY CONDITIONS

Authors
Hsiao, Roger [1 ]
Ma, Jeff [1 ]
Hartmann, William [1 ]
Karafiat, Martin [2 ]
Grezl, Frantisek [2 ]
Burget, Lukas [2 ]
Szoke, Igor [2 ]
Cernocky, Jan Honza [2 ]
Watanabe, Shinji [3 ]
Chen, Zhuo [3 ]
Mallidi, Sri Harish [4 ]
Hermansky, Hynek [4 ]
Tsakalidis, Stavros [1 ]
Schwartz, Richard [1 ]
Affiliations
[1] Raytheon BBN Technol, Cambridge, MA 02138 USA
[2] Brno Univ Technol, Speech FIT & Ctr Excellence IT4I, CS-61090 Brno, Czech Republic
[3] Mitsubishi Elect Res Labs, Cambridge, MA USA
[4] Johns Hopkins Univ, Baltimore, MD USA
Keywords
ASpIRE challenge; robust speech recognition
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we describe our work on the ASpIRE (Automatic Speech recognition In Reverberant Environments) challenge, which aims to assess the robustness of automatic speech recognition (ASR) systems. The defining characteristic of the challenge is the need to develop a high-performance system without access to matched training and development data. While the evaluation data are recorded with far-field microphones in noisy and reverberant rooms, the training data are telephone and close-talking speech. Our approach to this challenge includes speech enhancement, neural network methods, and acoustic model adaptation. We show that these techniques can successfully alleviate the performance degradation due to noisy audio and data mismatch.
Pages: 533-538
Page count: 6