ROBUST SPEECH RECOGNITION IN UNKNOWN REVERBERANT AND NOISY CONDITIONS

Cited by: 0
Authors
Hsiao, Roger [1 ]
Ma, Jeff [1 ]
Hartmann, William [1 ]
Karafiat, Martin [2 ]
Grezl, Frantisek [2 ]
Burget, Lukas [2 ]
Szoke, Igor [2 ]
Cernocky, Jan Honza [2 ]
Watanabe, Shinji [3 ]
Chen, Zhuo [3 ]
Mallidi, Sri Harish [4 ]
Hermansky, Hynek [4 ]
Tsakalidis, Stavros [1 ]
Schwartz, Richard [1 ]
Affiliations
[1] Raytheon BBN Technol, Cambridge, MA 02138 USA
[2] Brno Univ Technol, Speech FIT & Ctr Excellence IT4I, CS-61090 Brno, Czech Republic
[3] Mitsubishi Elect Res Labs, Cambridge, MA USA
[4] Johns Hopkins Univ, Baltimore, MD USA
Keywords
ASpIRE challenge; robust speech recognition
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we describe our work on the ASpIRE (Automatic Speech recognition In Reverberant Environments) challenge, which aims to assess the robustness of automatic speech recognition (ASR) systems. The main characteristic of the challenge is that a high-performance system must be developed without access to matched training and development data. While the evaluation data are recorded with far-field microphones in noisy and reverberant rooms, the training data are telephone speech and close-talking speech. Our approach to this challenge includes speech enhancement, neural network methods, and acoustic model adaptation. We show that these techniques can successfully alleviate the performance degradation due to noisy audio and data mismatch.
Pages: 533 - 538
Page count: 6
Related papers
50 in total
  • [21] Robust Speech Recognition for Similar Japanese Pronunciation Phrases Under Noisy Conditions
    Mufungulwa, George
    Tsutsui, Hiroshi
    Miyanaga, Yoshikazu
    Abe, Shin-ichi
    Ochi, Mitsuru
    2017 INTERNATIONAL SYMPOSIUM ON SIGNALS, CIRCUITS AND SYSTEMS (ISSCS), 2017
  • [22] MAXIMUM LIKELIHOOD PSD ESTIMATION FOR SPEECH ENHANCEMENT IN REVERBERANT AND NOISY CONDITIONS
    Kuklasinski, Adam
    Doclo, Simon
    Jensen, Jesper
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 599 - 603
  • [23] Contribution of modulation spectral features for cross-lingual speech emotion recognition under noisy reverberant conditions
    Guo, Taiyang
    Li, Sixia
    Kidani, Shunsuke
    Okada, Shogo
    Unoki, Masashi
    2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023, : 2221 - 2227
  • [24] Experimental study of robust acoustic beamforming for speech acquisition in reverberant and noisy environments
    Zhao, Yingke
    Jensen, Jesper Rindom
    Jensen, Tobias Lindstrom
    Chen, Jingdong
    Christensen, Mads Graesboll
    APPLIED ACOUSTICS, 2020, 170
  • [25] Robust speaker recognition in noisy conditions
    Ming, Ji
    Hazen, Timothy J.
    Glass, James R.
    Reynolds, Douglas A.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (05) : 1711 - 1723
  • [26] IMPULSE RESPONSE ESTIMATION FOR ROBUST SPEECH RECOGNITION IN A REVERBERANT ENVIRONMENT
    Ravanelli, Mirco
    Sosi, Alessandro
    Svaizer, Piergiorgio
    Omologo, Maurizio
    2012 PROCEEDINGS OF THE 20TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2012, : 1668 - 1672
  • [27] An efficient joint training model for monaural noisy-reverberant speech recognition
    Lian, Xiaoyu
    Xia, Nan
    Dai, Gaole
    Yang, Hongqin
    Applied Acoustics, 2025, 228
  • [28] Improving Robustness of Speaker Recognition in Noisy and Reverberant Conditions via Training
    Al-Noori, Ahmed H.
    Al-Karawi, Khamis A.
    Li, Francis F.
    2015 European Intelligence and Security Informatics Conference (EISIC), 2015, : 180 - 180
  • [29] Speech Enhancement of Noisy and Reverberant Speech for Text-to-Speech
    Valentini-Botinhao, Cassia
    Yamagishi, Junichi
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (08) : 1420 - 1433
  • [30] SPEECH RECOGNITION IN UNSEEN AND NOISY CHANNEL CONDITIONS
    Mitra, Vikramjit
    Franco, Horacio
    Bartels, Chris
    van Hout, Julien
    Graciarena, Martin
    Vergyri, Dimitra
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5215 - 5219