Toward Robust Speech Recognition and Understanding

Cited by: 0
Author
Sadaoki Furui
Affiliation
[1] Tokyo Institute of Technology, Department of Computer Science
Keywords
speech recognition; speech understanding; robustness; adaptation; spontaneous speech; corpus; acoustic models; language models; dialogue; multi-modal; summarization
DOI
Not available
Abstract
The principal cause of speech recognition errors is a mismatch between trained acoustic/language models and input speech due to the limited amount of training data in comparison with the vast variation of speech. It is crucial to establish methods that are robust against voice variation due to individuality, the physical and psychological condition of the speaker, telephone sets, microphones, network characteristics, additive background noise, speaking styles, and other aspects. This paper overviews robust architecture and modeling techniques for speech recognition and understanding. The topics include acoustic and language modeling for spontaneous speech recognition, unsupervised adaptation of acoustic and language models, robust architecture for spoken dialogue systems, multi-modal speech recognition, and speech summarization. This paper also discusses the most important research problems to be solved in order to achieve ultimate robust speech recognition and understanding systems.
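As a purely illustrative sketch, not a technique taken from this paper: one standard feature-level way to reduce the acoustic mismatch described in the abstract (e.g., differing microphones, telephone channels, or background conditions between training and use) is cepstral mean and variance normalization (CMVN). The code below is a minimal, self-contained example; the function name, synthetic feature matrix, and dimensions are hypothetical choices made for demonstration.

```python
# Illustrative sketch only: cepstral mean and variance normalization (CMVN),
# a common front-end technique for reducing stationary channel mismatch
# between training and test conditions. Not a method proposed in this paper.
import numpy as np


def cmvn(features: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Normalize each cepstral dimension to zero mean and unit variance
    over the utterance, cancelling constant channel/convolutional effects."""
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    return (features - mean) / (std + eps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for MFCCs: 200 frames x 13 coefficients with an
    # arbitrary offset and scale, mimicking a channel-shifted utterance.
    utterance = rng.normal(loc=3.0, scale=2.0, size=(200, 13))
    normalized = cmvn(utterance)
    print(normalized.mean(axis=0).round(3))  # close to 0 per dimension
    print(normalized.std(axis=0).round(3))   # close to 1 per dimension
```

In a full system such feature-level normalization is typically combined with the model-level strategies the abstract lists, such as unsupervised adaptation of acoustic and language models.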
Pages: 245 - 254
Number of pages: 9
Related papers (50 in total)
  • [41] Recognizing articulatory gestures from speech for robust speech recognition
    Mitra, Vikramjit
    Nam, Hosung
    Espy-Wilson, Carol
    Saltzman, Elliot
    Goldstein, Louis
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2012, 131 (03): : 2270 - 2287
  • [42] Joint decoding of multiple speech patterns for robust speech recognition
    Nair, Nishanth Ulhas
    Sreenivas, T. V.
    2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2, 2007, : 93 - 98
  • [43] REINFORCEMENT LEARNING BASED SPEECH ENHANCEMENT FOR ROBUST SPEECH RECOGNITION
    Shen, Yih-Liang
    Huang, Chao-Yuan
    Wang, Syu-Siang
    Tsao, Yu
    Wang, Hsin-Min
    Chi, Tai-Shih
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6750 - 6754
  • [44] Enhancing the magnitude spectrum of speech features for robust speech recognition
    Hung, Jeih-weih
    Fan, Hao-teng
    Tu, Wen-hsiang
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2012
  • [45] Temporal structure normalization of speech feature for robust speech recognition
    Xiao, Xiong
    Chng, Eng Siong
    Li, Haizhou
    IEEE SIGNAL PROCESSING LETTERS, 2007, 14 (07) : 500 - 503
  • [47] A STUDY ON DATA AUGMENTATION OF REVERBERANT SPEECH FOR ROBUST SPEECH RECOGNITION
    Ko, Tom
    Peddinti, Vijayaditya
    Povey, Daniel
    Seltzer, Michael L.
    Khudanpur, Sanjeev
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5220 - 5224
  • [48] Noise-Robust speech recognition of Conversational Telephone Speech
    Chen, Gang
    Tolba, Hesham
    O'Shaughnessy, Douglas
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1101 - 1104
  • [49] Robust speech detection method for telephone speech recognition system
    ATR Interpreting Telecommunications Research Lab, Kyoto, Japan
    SPEECH COMMUNICATION, 2 (135-148)
  • [50] Robust Speech Recognition with Speech Enhanced Deep Neural Networks
    Du, Jun
    Wang, Qing
    Gao, Tian
    Xu, Yong
    Dai, Lirong
    Lee, Chin-Hui
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 616 - 620