Toward Robust Speech Recognition and Understanding

Cited by: 0
Author
Sadaoki Furui
Affiliation
[1] Tokyo Institute of Technology, Department of Computer Science
Keywords
speech recognition; speech understanding; robustness; adaptation; spontaneous speech; corpus; acoustic models; language models; dialogue; multi-modal; summarization
DOI
Not available
Abstract
The principal cause of speech recognition errors is a mismatch between the trained acoustic and language models and the input speech, which arises because the amount of training data is limited in comparison with the vast variation of speech. It is crucial to establish methods that are robust against voice variation due to individuality, the physical and psychological condition of the speaker, telephone sets, microphones, network characteristics, additive background noise, speaking styles, and other factors. This paper gives an overview of robust architectures and modeling techniques for speech recognition and understanding. The topics include acoustic and language modeling for spontaneous speech recognition, unsupervised adaptation of acoustic and language models, robust architecture for spoken dialogue systems, multi-modal speech recognition, and speech summarization. The paper also discusses the most important research problems that must be solved to achieve ultimately robust speech recognition and understanding systems.
Pages: 245-254
Page count: 9