Use of microphone array and model adaptation for hands-free speech acquisition and recognition

Cited by: 3

Authors:

Chien, JT [1]

Lai, JR [1]

Affiliations:

[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 70101, Taiwan

Source:

JOURNAL OF VLSI SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY | 2004, Vol. 36, Issue 2-3

Keywords:

microphone array; delay-and-sum beamformer; coherence measure; model adaptation; speech enhancement; speech recognition;

DOI:

10.1023/B:VLSI.0000015093.07192.eb

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

This paper presents a combined microphone array and model adaptation algorithm for hands-free speech recognition. Our purpose is to remove the inconvenience of using a head-mounted or hand-held microphone with a conventional speech recognizer. To improve speech quality under car noise interference, a linear microphone array is used as a robust acquisition system. A time-domain coherence measure (TDCM) is applied to reliably estimate the time delay between speech signals collected by different microphones. The estimated delay is then used in a delay-and-sum beamformer for speech enhancement. Further, we adapt the speech hidden Markov models to bring them closer to the acoustic conditions of the enhanced test speech for robust speech recognition. In acquisition and recognition experiments using connected Chinese digits, we found that TDCM can effectively estimate the time delay. Increasing the speech sampling rate helps in determining the time delay. Incorporating the model adaptation scheme significantly reduces recognition errors with moderate computational overhead.
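The acquisition stage described in the abstract (estimate an inter-microphone delay, then align and average the channels) can be illustrated with a minimal Python sketch. This is not the paper's method: the TDCM is not specified in this record, so plain normalized cross-correlation over integer lags is substituted as the delay estimator. The function names `estimate_delay`, `shift`, and `delay_and_sum`, and the `max_lag` parameter, are hypothetical and chosen only for illustration.

```python
import numpy as np


def estimate_delay(ref, sig, max_lag):
    """Return the integer lag (in samples) by which `sig` trails `ref`.

    Assumption: normalized cross-correlation stands in for the paper's
    TDCM, which this record does not define.
    """
    ref = ref - ref.mean()
    sig = sig - sig.mean()
    n = len(ref)
    best_lag, best_score = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        # Pair up the samples that overlap when `sig` is advanced by `lag`.
        if lag >= 0:
            a, b = ref[:n - lag], sig[lag:]
        else:
            a, b = ref[-lag:], sig[:n + lag]
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        score = float(a @ b) / denom if denom > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def shift(x, lag):
    """Advance `x` by `lag` samples (negative lag delays it), zero-filling."""
    out = np.zeros_like(x)
    if lag > 0:
        out[:len(x) - lag] = x[lag:]
    elif lag < 0:
        out[-lag:] = x[:len(x) + lag]
    else:
        out[:] = x
    return out


def delay_and_sum(channels, max_lag):
    """Align every channel to the first one and average them.

    Returns the enhanced signal and the estimated per-channel delays.
    """
    ref = channels[0]
    delays = [0] + [estimate_delay(ref, ch, max_lag) for ch in channels[1:]]
    aligned = [shift(ch, d) for ch, d in zip(channels, delays)]
    return np.mean(aligned, axis=0), delays
```

Because the lags here are whole samples, the delay resolution is one sample period (1/fs seconds at sampling rate fs). That is one plausible reading of why the abstract reports that a higher sampling rate helps delay estimation: the same physical delay is then resolved on a finer grid.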


Pages: 141-151

Number of pages: 11

50 items in total

[21] HMM adaptation and microphone array processing for distant speech recognition
Kleban, J
Gong, YF
2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000, : 1411 - 1414
[22] Experiments of in-car audio compensation for hands-free speech recognition
Matassoni, M
Omologo, M
Zieger, C
ASRU'03: 2003 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING ASRU '03, 2003, : 369 - 374
[23] IMPROVED HANDS-FREE AUTOMATIC SPEECH RECOGNITION IN REVERBERANT ENVIRONMENT CONDITION
Gomez, Randy
Nakamura, Keisuke
Mizumoto, Takeshi
Nakadai, Kazuhiro
2014 4TH JOINT WORKSHOP ON HANDS-FREE SPEECH COMMUNICATION AND MICROPHONE ARRAYS (HSCMA), 2014, : 67 - 71
[24] Likelihood-maximizing beamforming for robust hands-free speech recognition
Seltzer, ML
Raj, B
Stern, RM
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2004, 12 (05): : 489 - 498
[25] Speech enhancement for hands-free terminals
Grbic, N
Nordholm, S
Johansson, A
ISPA 2001: PROCEEDINGS OF THE 2ND INTERNATIONAL SYMPOSIUM ON IMAGE AND SIGNAL PROCESSING AND ANALYSIS, 2001, : 435 - 440
[26] Study of microphone system for hands-free teleconferencing units
Nakagawa, Akira
Shimauchi, Suehiro
Makino, Shoji
Journal of the Acoustical Society of Japan (E) (English translation of Nippon Onkyo Gakkaishi), 2000, 21 (01): : 33 - 35
[27] HANDS-FREE SPEECH RECOGNITION CHALLENGE FOR REAL-WORLD SPEECH DIALOGUE SYSTEMS
Saruwatari, Hiroshi
Kawanami, Hiromichi
Takeuchi, Shota
Takahashi, Yu
Cincarek, Tobias
Shikano, Kiyohiro
2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS, 2009, : 3729 - 3732
[28] Experiments of hands-free connected digit recognition using a microphone array
Omologo, M
Matassoni, M
Svaizer, P
Giuliani, D
1997 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, PROCEEDINGS, 1997, : 490 - 497
[29] Noise-robust hands-free speech recognition based on spatial subtraction array and known noise superimposition
Ohashi, Y
Nishikawa, T
Saruwatari, H
Lee, A
Shikano, K
2005 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-4, 2005, : 533 - 537
[30] APPLICATION OF A HEAD-CONTACT MICROPHONE TO HANDS-FREE SPEECH COMMUNICATION UNDER HAZARDOUS CONDITIONS
SEBESTA, GJ
MELLEN, AJ
HOFER, A
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1967, 41 (06): : 1616 - +
