On the selection of the impulse responses for distant-speech recognition based on contaminated speech training

Authors
Ravanelli, Mirco [1]
Omologo, Maurizio [1]
Affiliations
[1] Fdn Bruno Kessler, Trento, Italy
Source
15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4 | 2014
Keywords
robust speech recognition; multi-condition training; reverberation
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Distant-speech recognition is a technology of fundamental importance for the future development of assistive applications characterized by flexible and unobtrusive interaction in home environments. State-of-the-art speech recognition still exhibits a lack of robustness and unacceptable performance variability due to environmental noise, reverberation effects, and speaker position. In the past, multi-condition training and contamination methods were explored to reduce the mismatch between training and test conditions. However, the performance evaluation can be biased by factors such as the limited number of speaker and microphone positions, the adopted set of impulse responses, and the vocabulary and grammars defining the recognition task. The purpose of this paper is to investigate in more detail some critical aspects that characterize this experimental context. To this end, our work addressed a microphone network distributed over different rooms of an apartment and a related set of speaker-microphone pairs, leading to a very large set of impulse responses. Besides simulations, the experiments also tackled real speech interactions. The performance evaluation was based on a phone-loop task in order to minimize the influence of linguistic constraints. The experimental results show that an accurate selection of impulse responses is less critical than other factors, such as the signal-to-noise ratio introduced by additive background noise.
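The abstract does not spell out the contamination procedure. As a minimal sketch of how contaminated training speech is commonly generated, assuming a close-talk signal, a measured impulse response, and a background-noise recording are available as NumPy arrays, the snippet below convolves the clean speech with the impulse response and adds noise scaled to a target signal-to-noise ratio. The function name `contaminate` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def contaminate(clean, impulse_response, noise, snr_db):
    """Reverberate `clean` with `impulse_response`, then add `noise` at `snr_db`.

    Hypothetical helper: an illustrative sketch, not the authors' pipeline.
    """
    # Simulate the room by convolving with the impulse response,
    # truncated to the original utterance length.
    reverberated = np.convolve(clean, impulse_response)[: len(clean)]

    # Repeat or truncate the noise so it covers the whole utterance.
    noise = np.resize(noise, len(reverberated))

    # Scale the noise so that speech power / noise power matches snr_db.
    speech_power = np.mean(reverberated ** 2)
    noise_power = np.mean(noise ** 2)
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))

    return reverberated + gain * noise
```

Under these assumptions, the two factors compared in the abstract map onto two arguments: varying `impulse_response` changes the simulated speaker-microphone pair, while varying `snr_db` changes the level of additive background noise.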
Pages: 1028-1032
Number of pages: 5
Related papers
50 in total
  • [21] Training of Automatic Speech Recognition System on Noised Speech
    Prodeus, Arkadiy
    Kukharicheva, Kateryna
    2016 4TH INTERNATIONAL CONFERENCE ON METHODS AND SYSTEMS OF NAVIGATION AND MOTION CONTROL (MSNMC), 2016, : 221 - 223
  • [22] Automatic Speech Recognition Performance for Training on Noised Speech
    Prodeus, Arkadiy
    Kukharicheva, Kateryna
    2017 2ND IEEE INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION AND COMMUNICATION TECHNOLOGIES-2017 (AICT 2017), 2017, : 71 - 74
  • [23] Audio-visual speech recognition based on joint training with audio-visual speech enhancement for robust speech recognition
    Hwang, Jung-Wook
    Park, Jeongkyun
    Park, Rae-Hong
    Park, Hyung-Min
    APPLIED ACOUSTICS, 2023, 211
  • [24] A DIGITAL MICROPHONE ARRAY FOR DISTANT SPEECH RECOGNITION
    Zwyssig, Erich
    Lincoln, Mike
    Renals, Steve
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 5106 - 5109
  • [25] Robust distant-talking speech recognition
    Lin, Q
    Che, C
    Yuk, DS
    Jin, L
    deVries, B
    Pearson, J
    Flanagan, J
    1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 21 - 24
  • [26] Convolutional Neural Networks for Distant Speech Recognition
    Swietojanski, Pawel
    Ghoshal, Arnab
    Renals, Steve
    IEEE SIGNAL PROCESSING LETTERS, 2014, 21 (09) : 1120 - 1124
  • [27] Learning to Rank Microphones for Distant Speech Recognition
    Cornell, Samuele
    Brutti, Alessio
    Matassoni, Marco
    Squartini, Stefano
    INTERSPEECH 2021, 2021, : 3855 - 3859
  • [28] Microphone Array Processing for Distant Speech Recognition
    Kumatani, Kenichi
    McDonough, John
    Raj, Bhiksha
    IEEE SIGNAL PROCESSING MAGAZINE, 2012, 29 (06) : 127 - 140
  • [29] Strategies for distant speech recognition in reverberant environments
    Delcroix, Marc
    Yoshioka, Takuya
    Ogawa, Atsunori
    Kubo, Yotaro
    Fujimoto, Masakiyo
    Ito, Nobutaka
    Kinoshita, Keisuke
    Espi, Miquel
    Araki, Shoko
    Hori, Takaaki
    Nakatani, Tomohiro
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2015,
  • [30] Data selection for speech recognition
    Wu, Yi
    Zhang, Rong
    Rudnicky, Alexander
    2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2, 2007, : 562 - 565