Combining standard and throat microphones for robust speech recognition

Cited by: 66
Authors
Graciarena, M
Franco, H
Sonmez, K
Bratt, H
Affiliations
[1] SRI Int, Speech Technol & Res Lab, Menlo Pk, CA 94025 USA
[2] Univ Buenos Aires, Sch Engn, Inst Biomed Engn, RA-1053 Buenos Aires, DF, Argentina
Keywords
noise robustness; probabilistic optimum filtering; speech recognition; throat microphone
DOI
10.1109/LSP.2003.808549
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract
We present a method to combine the standard and throat microphone signals for robust speech recognition in noisy environments. Our approach is to use the probabilistic optimum filter (POF) mapping algorithm to estimate the standard microphone clean-speech feature vectors, used by standard speech recognizers, from both microphones' noisy-speech feature vectors. A small untranscribed "stereo" database (noisy and clean simultaneous recordings) is required to train the POF mappings. In continuous-speech recognition experiments using SRI International's DECIPHER recognition system, with both artificially added noise and recorded noisy speech, the combined-microphone approach significantly outperforms the single-microphone approach.
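The abstract does not give the details of the POF mapping, so the following is only a minimal sketch of the general idea it describes: learn a mapping from noisy feature vectors (standard and throat microphone concatenated) to clean standard-microphone features, using simultaneously recorded "stereo" data. The sketch assumes a piecewise-affine mapping with one affine map per region, fitted by posterior-weighted least squares. The fixed centroids, the isotropic Gaussian region weights, the scalar toy features, and all function names (`posteriors`, `solve`, `train`, `estimate`, `demo`) are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def posteriors(z, centroids, var=1.0):
    """Soft region weights p(i | z): isotropic Gaussians at fixed centroids (assumed)."""
    logs = [-sum((a - b) ** 2 for a, b in zip(z, c)) / (2.0 * var) for c in centroids]
    top = max(logs)
    exps = [math.exp(v - top) for v in logs]
    total = sum(exps)
    return [e / total for e in exps]


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[r]) + list(B[r]) for r in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def train(noisy, clean, centroids, ridge=1e-6):
    """Fit one affine map per region from stereo (noisy, clean) feature pairs."""
    d_in = len(noisy[0]) + 1  # +1 for the bias term
    d_out = len(clean[0])
    maps = []
    for i in range(len(centroids)):
        # Weighted normal equations: R W = Q, with a small ridge for stability.
        R = [[ridge if r == c else 0.0 for c in range(d_in)] for r in range(d_in)]
        Q = [[0.0] * d_out for _ in range(d_in)]
        for y, x in zip(noisy, clean):
            w = posteriors(y, centroids)[i]
            ya = list(y) + [1.0]
            for r in range(d_in):
                for c in range(d_in):
                    R[r][c] += w * ya[r] * ya[c]
                for c in range(d_out):
                    Q[r][c] += w * ya[r] * x[c]
        maps.append(solve(R, Q))  # each map has shape d_in x d_out
    return maps


def estimate(y, centroids, maps):
    """Clean-feature estimate: posterior-weighted sum of the per-region affine maps."""
    post = posteriors(y, centroids)
    ya = list(y) + [1.0]
    d_out = len(maps[0][0])
    return [sum(p * sum(ya[r] * W[r][c] for r in range(len(ya)))
                for p, W in zip(post, maps)) for c in range(d_out)]


def demo(seed=0):
    """Toy comparison (synthetic data): standard mic only vs. both microphones."""
    rng = random.Random(seed)

    def make(n):
        rows = []
        for _ in range(n):
            x = rng.uniform(-1.0, 1.0)               # clean feature
            std = x + rng.gauss(0.0, 1.0)            # standard mic: heavy noise
            throat = 0.5 * x + rng.gauss(0.0, 0.1)   # throat mic: distorted, less noise
            rows.append((x, std, throat))
        return rows

    train_rows, test_rows = make(400), make(200)
    c_std = [(-0.5,), (0.5,)]
    c_both = [(-0.5, -0.25), (0.5, 0.25)]
    clean = [[x] for x, _, _ in train_rows]
    m_std = train([[s] for _, s, _ in train_rows], clean, c_std)
    m_both = train([[s, t] for _, s, t in train_rows], clean, c_both)
    n = len(test_rows)
    err_std = sum((estimate([s], c_std, m_std)[0] - x) ** 2 for x, s, _ in test_rows) / n
    err_both = sum((estimate([s, t], c_both, m_both)[0] - x) ** 2 for x, s, t in test_rows) / n
    return err_std, err_both
```

Calling `demo()` returns the held-out mean squared error for the standard-microphone-only mapping and for the combined mapping. In this synthetic setup the combined mapping should give the lower error, because the toy throat channel is distorted but much less noisy. This illustrates the motivation stated in the abstract and does not reproduce the paper's experiments.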
Pages: 72-74
Page count: 3