A noise-robust speech input interface for information kiosk terminals

Cited by: 1

Authors:

Ida, M [1]

Mori, H

Nakamura, S

Shikano, K

Affiliations:

[1] OMRON Co, Informat Technol Res Ctr, Kyoto 6008530, Japan

[2] Nara Inst Sci & Technol, Grad Sch Informat Sci, Ikoma 6300101, Japan

Source:

ELECTRONICS AND COMMUNICATIONS IN JAPAN PART II-ELECTRONICS | 2004 / Vol. 87 / No. 12

Keywords:

speech recognition; noise reduction; microphone array; spectral subtraction

DOI:

10.1002/ecjb.20135

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification codes:

0808; 0809

Abstract:

In social amenity apparatus such as information terminals or on-line shopping terminals and ticket vending machines, more natural and smoother communication between machines and humans is made possible by incorporating speech recognition, thus rapidly enhancing operation and convenience. In this paper, we will describe the trial production of an information kiosk terminal equipped with speech recognition and the noise reduction technique adopted as a speech input interface. In general, the aforementioned system equipment is installed in public spaces such as stations. Therefore, robust speech recognition performance is required even in the worst noise environments. For the realization of this antinoise capability, we have used noise reduction based on both a microphone array and spectral subtraction (SS). Experiments with evaluation data recorded in real environments indicate that a recognition performance of over 90% in isolated-word recognition of 216 words uttered by an unspecified speaker is achieved by the combined use of a 32-channel microphone array and SS. (C) 2004 Wiley Periodicals, Inc.

Pages: 51-61

Page count: 11
