Toward Robust Speech Recognition and Understanding

Cited by: 0
Author
Sadaoki Furui
Affiliation
[1] Tokyo Institute of Technology, Department of Computer Science
Keywords
speech recognition; speech understanding; robustness; adaptation; spontaneous speech; corpus; acoustic models; language models; dialogue; multi-modal; summarization
DOI
Not available
Abstract
The principal cause of speech recognition errors is a mismatch between trained acoustic/language models and input speech due to the limited amount of training data in comparison with the vast variation of speech. It is crucial to establish methods that are robust against voice variation due to individuality, the physical and psychological condition of the speaker, telephone sets, microphones, network characteristics, additive background noise, speaking styles, and other aspects. This paper overviews robust architecture and modeling techniques for speech recognition and understanding. The topics include acoustic and language modeling for spontaneous speech recognition, unsupervised adaptation of acoustic and language models, robust architecture for spoken dialogue systems, multi-modal speech recognition, and speech summarization. This paper also discusses the most important research problems to be solved in order to achieve ultimate robust speech recognition and understanding systems.
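As a purely illustrative sketch, not a technique taken from this paper: one standard feature-level way to reduce the acoustic mismatch described in the abstract (e.g., differing microphones, telephone channels, or background conditions between training and use) is cepstral mean and variance normalization (CMVN). The code below is a minimal, self-contained example; the function name, synthetic feature matrix, and dimensions are hypothetical choices made for demonstration.

```python
# Illustrative sketch only: cepstral mean and variance normalization (CMVN),
# a common front-end technique for reducing stationary channel mismatch
# between training and test conditions. Not a method proposed in this paper.
import numpy as np


def cmvn(features: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Normalize each cepstral dimension to zero mean and unit variance
    over the utterance, cancelling constant channel/convolutional effects."""
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    return (features - mean) / (std + eps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for MFCCs: 200 frames x 13 coefficients with an
    # arbitrary offset and scale, mimicking a channel-shifted utterance.
    utterance = rng.normal(loc=3.0, scale=2.0, size=(200, 13))
    normalized = cmvn(utterance)
    print(normalized.mean(axis=0).round(3))  # close to 0 per dimension
    print(normalized.std(axis=0).round(3))   # close to 1 per dimension
```

In a full system such feature-level normalization is typically combined with the model-level strategies the abstract lists, such as unsupervised adaptation of acoustic and language models.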
Pages: 245 - 254
Number of pages: 9
Related papers (50 in total)
  • [41] Recognizing articulatory gestures from speech for robust speech recognition
    Mitra, Vikramjit
    Nam, Hosung
    Espy-Wilson, Carol
    Saltzman, Elliot
    Goldstein, Louis
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2012, 131 (03): : 2270 - 2287
  • [42] Joint decoding of multiple speech patterns for robust speech recognition
    Nair, Nishanth Ulhas
    Sreenivas, T. V.
    2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2, 2007, : 93 - 98
  • [43] REINFORCEMENT LEARNING BASED SPEECH ENHANCEMENT FOR ROBUST SPEECH RECOGNITION
    Shen, Yih-Liang
    Huang, Chao-Yuan
    Wang, Syu-Siang
    Tsao, Yu
    Wang, Hsin-Min
    Chi, Tai-Shih
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6750 - 6754
  • [44] Enhancing the magnitude spectrum of speech features for robust speech recognition
    Hung, Jeih-weih
    Fan, Hao-teng
    Tu, Wen-hsiang
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2012
  • [45] Temporal structure normalization of speech feature for robust speech recognition
    Xiao, Xiong
    Chng, Eng Siong
    Li, Haizhou
    IEEE SIGNAL PROCESSING LETTERS, 2007, 14 (07) : 500 - 503
  • [47] A STUDY ON DATA AUGMENTATION OF REVERBERANT SPEECH FOR ROBUST SPEECH RECOGNITION
    Ko, Tom
    Peddinti, Vijayaditya
    Povey, Daniel
    Seltzer, Michael L.
    Khudanpur, Sanjeev
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5220 - 5224
  • [48] Noise-Robust speech recognition of Conversational Telephone Speech
    Chen, Gang
    Tolba, Hesham
    O'Shaughnessy, Douglas
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1101 - 1104
  • [49] Robust speech detection method for telephone speech recognition system
    ATR Interpreting Telecommunications Research Lab, Kyoto, Japan
    SPEECH COMMUNICATION, 2 (135-148)
  • [50] Robust Speech Recognition with Speech Enhanced Deep Neural Networks
    Du, Jun
    Wang, Qing
    Gao, Tian
    Xu, Yong
    Dai, Lirong
    Lee, Chin-Hui
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 616 - 620