A Bayesian prediction approach to robust speech recognition and online environmental learning

Cited by: 5
Authors
Chien, JT [1]
Affiliations
[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 70101, Taiwan
Keywords
Bayesian predictive classification (BPC); online unsupervised learning; speaker adaptation; speech recognition; hidden Markov model
DOI
10.1016/S0167-6393(01)00032-2
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
A robust speech recognizer is developed to tackle the inevitable mismatch between training and testing environments. Because realistic environments are uncertain and nonstationary, it is necessary to characterize the uncertainty of speech hidden Markov models (HMMs) for recognition and to track that uncertainty incrementally so as to capture the newest environmental statistics. In this paper, we develop a new Bayesian predictive classification (BPC) approach for robust decision and online environmental learning. The BPC decision is established by modeling the uncertainties of both the HMM mean vector and precision matrix using a conjugate prior density. Frame-based predictive distributions using multivariate t distributions and approximate Gaussian distributions are exploited. After recognition, the prior density is pooled with the likelihood of the current test sentence to generate the reproducible prior density. The hyperparameters of the prior density are accordingly adjusted to match the newest environments and are applied to the recognition of upcoming data. As a result, an efficient online unsupervised learning strategy is developed for HMM-based speech recognition without needing adaptation data. In the experiments, the proposed approach is significantly better than the conventional plug-in maximum a posteriori (MAP) decision on the recognition of connected Chinese digits in hands-free car environments. The approach is also economical in computation. (C) 2002 Elsevier Science B.V. All rights reserved.
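The mechanism the abstract describes (a conjugate prior over a Gaussian's mean and precision, a t-distributed predictive density, and a hyperparameter update after each test sentence) can be illustrated with a minimal sketch. This is not the paper's formulation: it assumes a single scalar feature with a normal-gamma prior in place of the multivariate prior over HMM mean vectors and precision matrices, and the names `NormalGamma`, `predictive_logpdf`, and `update` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalGamma:
    """Hypothetical scalar conjugate prior: mean | precision ~ N(mu, 1/(kappa*precision)),
    precision ~ Gamma(alpha, rate=beta)."""
    mu: float
    kappa: float
    alpha: float
    beta: float


def predictive_logpdf(prior: NormalGamma, x: float) -> float:
    """Log predictive density of one frame x with mean and precision integrated out.

    Under a normal-gamma prior this is a Student t with 2*alpha degrees of
    freedom, location mu and squared scale beta*(kappa+1)/(alpha*kappa).
    """
    nu = 2.0 * prior.alpha
    scale2 = prior.beta * (prior.kappa + 1.0) / (prior.alpha * prior.kappa)
    z2 = (x - prior.mu) ** 2 / (nu * scale2)
    return (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * math.log(nu * math.pi * scale2)
            - 0.5 * (nu + 1.0) * math.log1p(z2))


def update(prior: NormalGamma, frames: list[float]) -> NormalGamma:
    """Pool the prior with the likelihood of the frames just recognized.

    The result is again normal-gamma, so it can serve as the prior for the
    next sentence without any separate adaptation data.
    """
    n = len(frames)
    if n == 0:
        return prior
    xbar = sum(frames) / n
    scatter = sum((x - xbar) ** 2 for x in frames)
    kappa_n = prior.kappa + n
    mu_n = (prior.kappa * prior.mu + n * xbar) / kappa_n
    alpha_n = prior.alpha + n / 2.0
    beta_n = (prior.beta + 0.5 * scatter
              + prior.kappa * n * (xbar - prior.mu) ** 2 / (2.0 * kappa_n))
    return NormalGamma(mu_n, kappa_n, alpha_n, beta_n)
```

In a recognizer along these lines, `predictive_logpdf` would stand in for the plug-in Gaussian score of each frame, and `update` would be called on the frames assigned to a state once decoding of a sentence has finished.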
Pages: 321-334
Page count: 14
Related papers
50 in total
  • [21] A Bayesian Approach to Robust Reinforcement Learning
    Derman, Esther
    Mankowitz, Daniel
    Mann, Timothy
    Mannor, Shie
    35TH UNCERTAINTY IN ARTIFICIAL INTELLIGENCE CONFERENCE (UAI 2019), 2020, 115 : 648 - 658
  • [22] Optimal online learning: a Bayesian approach
    Solla, SA
    Winther, O
    COMPUTER PHYSICS COMMUNICATIONS, 1999, 121 : 94 - 97
  • [23] REINFORCEMENT LEARNING BASED SPEECH ENHANCEMENT FOR ROBUST SPEECH RECOGNITION
    Shen, Yih-Liang
    Huang, Chao-Yuan
    Wang, Syu-Siang
    Tsao, Yu
    Wang, Hsin-Min
    Chi, Tai-Shih
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6750 - 6754
  • [24] Deep Learning for Environmentally Robust Speech Recognition
    Alhamada, A., I
    Khalifa, O. O.
    Abdalla, A. H.
    PROCEEDINGS OF THE 7TH INTERNATIONAL CONFERENCE ON ELECTRONIC DEVICES, SYSTEMS AND APPLICATIONS (ICEDSA2020), 2020, 2306
  • [25] Online Adaptive Learning for Speech Recognition Decoding
    Barnes, Jeff
    Lin, Hui
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 1958 - 1961
  • [26] Bayesian Feature Enhancement for Reverberation and Noise Robust Speech Recognition
    Leutnant, Volker
    Krueger, Alexander
    Haeb-Umbach, Reinhold
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (08): : 1640 - 1652
  • [28] Histogram equalization with Bayesian estimation for noise robust speech recognition
    Suh, Youngjoo
    Kim, Hoirin
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2018, 143 (02): : 677 - 685
  • [29] Robust speech recognition based on viterbi Bayesian predictive classification
    Jiang, H
    Hirose, K
    Huo, Q
    1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1551 - 1554
  • [30] A Strategic Approach for Robust Dysarthric Speech Recognition
    Revathi, A.
    Sasikaladevi, N.
    Arunprasanth, D.
    Amirtharajan, Rengarajan
    WIRELESS PERSONAL COMMUNICATIONS, 2024, 134 (04) : 2315 - 2346