Mixtures of Bayesian Joint Factor Analyzers for Noise Robust Automatic Speech Recognition

Citations: 0
Authors
Cui, Xiaodong [1]
Goel, Vaibhava [1]
Kingsbury, Brian [1]
Affiliations
[1] IBM T J Watson Res Ctr, Yorktown Hts, NY 10598 USA
Keywords
Bayesian joint factor analysis; automatic relevance determination; relevance vector machine; noise robustness; LVCSR; SPEAKER; VARIABILITY;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper investigates a noise-robust approach to automatic speech recognition based on a mixture of Bayesian joint factor analyzers. In this approach, noisy features are modeled by two joint groups of factors that account for speaker and noise variability and are estimated from clean and noisy speech, respectively. The factors form an overcomplete dictionary with a redundant representation. Automatic relevance determination (ARD) is carried out by the relevance vector machine (RVM), with sparsity-promoting priors applied to the two factor loading matrices. Experiments on large vocabulary continuous speech recognition (LVCSR) tasks show good improvements with this approach.
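The abstract does not give the model equations. As a rough illustration only, a generic joint factor analysis observation model with ARD priors could be written as below. All symbols here are assumptions introduced for this sketch and are not taken from the paper: $y$ is a noisy feature vector, $k$ indexes the mixture component, $U_k$ and $V_k$ are the speaker and noise loading matrices with columns $u_{k,j}$ and $v_{k,j}$, $x$ and $z$ are the corresponding factors, and $\alpha_{k,j}$, $\beta_{k,j}$ are per-column precision hyperparameters.

```latex
% Hypothetical notation; a generic sketch, not the paper's exact formulation.
\begin{align}
  y \mid k &= \mu_k + U_k x + V_k z + \epsilon,
  & x &\sim \mathcal{N}(0, I),\quad z \sim \mathcal{N}(0, I),\quad
      \epsilon \sim \mathcal{N}(0, \Psi_k), \\
  p(U_k \mid \alpha_k) &= \prod_{j} \mathcal{N}\!\left(u_{k,j} \mid 0,\ \alpha_{k,j}^{-1} I\right),
  & p(V_k \mid \beta_k) &= \prod_{j} \mathcal{N}\!\left(v_{k,j} \mid 0,\ \beta_{k,j}^{-1} I\right).
\end{align}
```

Under priors of this kind, a column whose estimated precision grows very large is driven toward zero, which effectively prunes that factor from the overcomplete dictionary. This is the usual sense in which ARD promotes sparsity.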
Pages: 3011-3015
Page count: 5
Related papers
50 in total
  • [31] Joint Training of Speech Separation, Filterbank and Acoustic Model for Robust Automatic Speech Recognition
    Wang, Zhong-Qiu
    Wang, DeLiang
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2839 - 2843
  • [32] Joint Adaptation and Adaptive Training of TVWR for Robust Automatic Speech Recognition
    Liu, Shilin
    Sim, Khe Chai
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 636 - 640
  • [33] Maximum likelihood joint estimation of channel and noise for robust speech recognition
    Zhao, YX
    2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000, : 1109 - 1112
  • [34] Joint Uncertainty Decoding With Predictive Methods for Noise Robust Speech Recognition
    Xu, Haitian
    Gales, Mark J. F.
    Chin, K. K.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (06): : 1665 - 1676
  • [35] GESTURE-BASED DYNAMIC BAYESIAN NETWORK FOR NOISE ROBUST SPEECH RECOGNITION
    Mitra, Vikramjit
    Nam, Hosung
    Espy-Wilson, Carol Y.
    Saltzman, Elliot
    Goldstein, Louis
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5172 - 5175
  • [36] Noise Robust Speech Features for Automatic Continuous Speech Recognition using Running Spectrum Analysis
    Ohnuki, Kazunaga
    Takahashi, Wataru
    Yoshizawa, Shingo
    Miyanaga, Yoshikazu
    2008 INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS AND INFORMATION TECHNOLOGIES, 2008, : 150 - 153
  • [37] Sparse coding of the modulation spectrum for noise-robust automatic speech recognition
    Sara Ahmadi
    Seyed Mohammad Ahadi
    Bert Cranen
    Lou Boves
    EURASIP Journal on Audio, Speech, and Music Processing, 2014
  • [38] Novel frequency masking curves for noise-robust automatic speech recognition
    Chen, Chia-Ping
    Yeh, Ja-Zang
    Wu, Bo-Feng
    JOURNAL OF THE CHINESE INSTITUTE OF ENGINEERS, 2013, 36 (06) : 696 - 703
  • [39] Exemplar-Based Sparse Representations for Noise Robust Automatic Speech Recognition
    Gemmeke, Jort F.
    Virtanen, Tuomas
    Hurmalainen, Antti
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (07): : 2067 - 2080
  • [40] Sparse coding of the modulation spectrum for noise-robust automatic speech recognition
    Ahmadi, Sara
    Ahadi, Seyed Mohammad
    Cranen, Bert
    Boves, Lou
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2014, : 1 - 20