BAYESIAN AND GAUSSIAN PROCESS NEURAL NETWORKS FOR LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION

Times cited: 0
Authors
Hu, Shoukang [1]
Lam, Max W. Y. [1]
Xie, Xurong [1]
Liu, Shansong [1]
Yu, Jianwei [1]
Wu, Xixin [1]
Liu, Xunying [1]
Meng, Helen [1]
Affiliations
[1] Chinese Univ Hong Kong, Hong Kong, Peoples R China
Source
2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2019
Keywords
Bayesian Neural Network; Gaussian Process Neural Network; activation function selection; speech recognition;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification code
070206; 082403;
Abstract
The hidden activation functions inside deep neural networks (DNNs) play a vital role in learning high-level discriminative features and controlling the information flows to track longer history. However, the fixed model parameters used in standard DNNs can lead to over-fitting and poor generalization when given limited training data. Furthermore, the precise forms of activations used in DNNs are often manually set at a global level for all hidden nodes, thus lacking an automatic selection method. To address these issues, Bayesian neural network (BNN) acoustic models are proposed in this paper to explicitly model the uncertainty associated with DNN parameters. Gaussian Process (GP) activation based DNN and LSTM acoustic models are also used in this paper to allow the optimal forms of hidden activations to be stochastically learned for individual hidden nodes. An efficient variational inference based training algorithm is derived for BNN, GPNN and GPLSTM systems. Experiments were conducted on an LVCSR system trained on a 75-hour subset of Switchboard I data. The best BNN and GPNN systems outperformed both the baseline DNN systems constructed using fixed-form activations and their combination via frame-level joint decoding by 1% absolute in word error rate.
Pages: 6555-6559
Number of pages: 5