Use of microphone array and model adaptation for hands-free speech acquisition and recognition

Cited by: 3
Authors
Chien, JT [1]
Lai, JR [1]
Affiliation
[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 70101, Taiwan
Keywords
microphone array; delay-and-sum beamformer; coherence measure; model adaptation; speech enhancement; speech recognition
DOI
10.1023/B:VLSI.0000015093.07192.eb
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents a combined microphone array and model adaptation algorithm for hands-free speech recognition. Our purpose is to remove the inconvenience of the head-mounted or hand-held microphone required by conventional speech recognizers. To improve speech quality under car noise interference, a linear microphone array is used as a robust acquisition system. A time-domain coherence measure (TDCM) is applied to reliably estimate the time delay between the speech signals collected by different microphones. The estimated delay is then used in a delay-and-sum beamformer for speech enhancement. Further, we adapt the speech hidden Markov models toward the acoustic conditions of the enhanced test speech for robust speech recognition. In acquisition and recognition experiments using connected Chinese digits, we found that TDCM estimates the time delay effectively and that increasing the speech sampling rate helps in determining the delay. Incorporating the model adaptation scheme significantly reduces recognition errors at moderate computational overhead.
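The acquisition stage described in the abstract (estimate each microphone's delay, then align and average the channels) can be illustrated with a minimal sketch. This record does not give the definition of TDCM, so the sketch substitutes a plain normalized time-domain cross-correlation as the delay estimator; it is not the authors' measure. The function names `estimate_delay`, `shift`, and `delay_and_sum`, and the `max_lag` parameter, are hypothetical and chosen only for illustration.

```python
import numpy as np


def estimate_delay(ref, sig, max_lag):
    """Return the integer lag (in samples) by which `sig` trails `ref`.

    ASSUMPTION: normalized cross-correlation stands in for the paper's
    TDCM, whose exact definition is not given in this record.
    """
    n = len(ref)
    best_lag, best_score = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        # Pair ref[k] with sig[k + lag] over the overlapping region.
        if lag >= 0:
            a, b = ref[:n - lag], sig[lag:]
        else:
            a, b = ref[-lag:], sig[:n + lag]
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        score = float(np.dot(a, b) / denom) if denom > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def shift(x, lag):
    """Advance `x` by `lag` samples (negative lag delays it), zero-filling."""
    y = np.zeros_like(x)
    if lag > 0:
        y[:len(x) - lag] = x[lag:]
    elif lag < 0:
        y[-lag:] = x[:len(x) + lag]
    else:
        y[:] = x
    return y


def delay_and_sum(channels, max_lag):
    """Align every channel to the first one and average them.

    Returns the enhanced signal and the per-channel lags that were used.
    """
    ref = channels[0]
    lags = [estimate_delay(ref, ch, max_lag) for ch in channels]
    aligned = [shift(ch, lag) for ch, lag in zip(channels, lags)]
    return np.mean(aligned, axis=0), lags
```

Because this sketch searches integer sample lags only, its delay resolution is one sample period, 1/fs seconds at sampling rate fs. That is one plausible reading of why the abstract reports that a higher sampling rate helps in determining the delay.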
Pages: 141-151
Page count: 11