Environmental conditions and acoustic transduction in hands-free speech recognition

Cited by: 54
Authors
Omologo, M [1]
Svaizer, P [1]
Matassoni, M [1]
Affiliation
[1] Ist Ric Sci & Tecnol, I-38050 Trento, Italy
Keywords
hands-free speech recognition; robustness; environmental noise; microphone arrays; acoustics; MAP adaptation
DOI
10.1016/S0167-6393(98)00030-2
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Hands-free interaction is key to increasing the flexibility of present applications and to developing new speech recognition applications in which the user cannot be encumbered by either hand-held or head-mounted microphones. When the microphone is far from the speaker, the transduced signal is affected by degradations of different kinds, which are often unpredictable. Special microphones and multi-microphone acquisition systems are one way of reducing some environmental noise effects. Robust processing and adaptation techniques can further be used to compensate for the different kinds of variability that may be present in the recognizer input. The purpose of this paper is to revisit some of the assumptions about the different sources of this variability and to discuss both special transducer systems and the compensation/adaptation techniques that can be adopted. In particular, the paper refers to the use of multi-microphone systems to overcome some undesired effects caused by room acoustics (e.g. reverberation) and by coherent/incoherent noise (e.g. competing talkers, computer fans). The paper concludes with a description of some experiments conducted on both real and simulated speech data. (C) 1998 Elsevier Science B.V. All rights reserved.
Pages: 75-95
Number of pages: 21