BAYESIAN AND GAUSSIAN PROCESS NEURAL NETWORKS FOR LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION

Cited by: 0
Authors
Hu, Shoukang [1]
Lam, Max W. Y. [1]
Xie, Xurong [1]
Liu, Shansong [1]
Yu, Jianwei [1]
Wu, Xixin [1]
Liu, Xunying [1]
Meng, Helen [1]
Affiliation
[1] Chinese University of Hong Kong, Hong Kong, People's Republic of China
Source
2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2019
Keywords
Bayesian Neural Network; Gaussian Process Neural Network; activation function selection; speech recognition
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
The hidden activation functions inside deep neural networks (DNNs) play a vital role in learning high-level discriminative features and controlling the information flow needed to track longer histories. However, the fixed model parameters used in standard DNNs can lead to over-fitting and poor generalization when training data is limited. Furthermore, the precise forms of the activations used in DNNs are often set manually at a global level for all hidden nodes, with no automatic selection method. To address these issues, this paper proposes Bayesian neural network (BNN) acoustic models that explicitly model the uncertainty associated with DNN parameters. Gaussian Process (GP) activation based DNN and LSTM acoustic models are also used, allowing the optimal forms of hidden activations to be learned stochastically for individual hidden nodes. An efficient variational inference based training algorithm is derived for the BNN, GPNN and GPLSTM systems. Experiments were conducted on an LVCSR system trained on a 75-hour subset of the Switchboard I data. The best BNN and GPNN systems outperformed both the baseline DNN systems constructed using fixed-form activations and their combination via frame-level joint decoding by 1% absolute in word error rate.
Pages: 6555-6559
Page count: 5