Sparse Bayesian Recurrent Neural Networks

Cited by: 5
Author(s)
Chatzis, Sotirios P. [1 ]
Affiliation(s)
[1] Cyprus Univ Technol, Dept Elect Engn Comp Engn & Informat, CY-3036 Limassol, Cyprus
Keywords
DOI
10.1007/978-3-319-23525-7_22
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recurrent neural networks (RNNs) have recently gained renewed attention from the machine learning community as effective methods for modeling variable-length sequences. Language modeling, handwriting recognition, and speech recognition are only a few of the application domains where RNN-based models have achieved the state-of-the-art performance currently reported in the literature. Typically, RNN architectures utilize simple linear, logistic, or softmax output layers to perform data modeling and prediction generation. In this work, for the first time in the literature, we consider using a sparse Bayesian regression or classification model as the output layer of RNNs, inspired by the automatic relevance determination (ARD) technique. The notion of ARD is to continually create new components while detecting when a component starts to overfit, where overfitting manifests itself as a precision hyperparameter posterior tending to infinity. In this way, our method manages to train sparse RNN models, where the number of effective ("active") recurrently connected hidden units is selected in a data-driven fashion, as part of the model inference procedure. We develop efficient and scalable training algorithms for our model under the stochastic variational inference paradigm, and derive elegant predictive density expressions with computational costs comparable to conventional RNN formulations. We evaluate our approach on challenging regression and classification tasks, and demonstrate its favorable performance over the state-of-the-art.
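Purely as an illustration of the ARD pruning mechanism the abstract describes, and not the paper's stochastic variational algorithm, the sketch below fits a Bayesian linear regression output layer on top of RNN hidden-state activations using the classical MacKay/Tipping hyperparameter re-estimation updates: each hidden unit gets its own precision hyperparameter, and units whose precision diverges toward infinity are pruned as irrelevant. The function name, the fixed noise precision `beta`, and the threshold `prune_thresh` are illustrative assumptions.

```python
import numpy as np

def ard_output_layer(H, y, n_iter=50, alpha_init=1.0, beta=100.0,
                     prune_thresh=1e6):
    """ARD sketch: sparse Bayesian linear output layer on RNN hidden states.

    H : (T, D) matrix of hidden-state activations (design matrix)
    y : (T,) regression targets
    Returns the posterior mean weights and the indices of surviving units.
    """
    T, D = H.shape
    alpha = np.full(D, alpha_init)   # per-unit precision hyperparameters
    active = np.arange(D)            # units not yet pruned
    for _ in range(n_iter):
        Ha = H[:, active]
        # Gaussian weight posterior: Sigma = (A + beta H^T H)^-1, m = beta Sigma H^T y
        A = np.diag(alpha[active])
        Sigma = np.linalg.inv(A + beta * Ha.T @ Ha)
        m = beta * Sigma @ Ha.T @ y
        # MacKay/Tipping re-estimation: gamma_d measures how well-determined w_d is
        gamma = 1.0 - alpha[active] * np.diag(Sigma)
        alpha[active] = gamma / (m ** 2 + 1e-12)
        # A precision diverging to infinity flags an irrelevant ("inactive") unit
        active = active[alpha[active] < prune_thresh]
    # Final posterior mean over the surviving active units
    Ha = H[:, active]
    A = np.diag(alpha[active])
    Sigma = np.linalg.inv(A + beta * Ha.T @ Ha)
    m = beta * Sigma @ Ha.T @ y
    return m, active
```

Applied to hidden states collected from a trained RNN, the indices returned in `active` play the role of the effective recurrently connected units selected in a data-driven fashion.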
Pages: 359-372
Number of pages: 14
Related papers
50 records in total
  • [1] BAYESIAN NEURAL NETWORKS FOR SPARSE CODING
    Kuzin, Danil
    Isupova, Olga
    Mihaylova, Lyudmila
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 2992 - 2996
  • [2] Dynamic reconstruction from noise contaminated data with sparse Bayesian recurrent neural networks
    Mirikitani, Derrick T.
    Park, Incheon
    Daoudi, Mohammed
    [J]. AMS 2007: FIRST ASIA INTERNATIONAL CONFERENCE ON MODELLING & SIMULATION ASIA MODELLING SYMPOSIUM, PROCEEDINGS, 2007, : 409+
  • [3] Recognizing recurrent neural networks (rRNN): Bayesian inference for recurrent neural networks
    Bitzer, Sebastian
    Kiebel, Stefan J.
    [J]. BIOLOGICAL CYBERNETICS, 2012, 106 (4-5) : 201 - 217
  • [4] Bayesian learning for recurrent neural networks
    Crucianu, M
    Boné, R
    de Beauville, JPA
    [J]. NEUROCOMPUTING, 2001, 36 (01) : 235 - 242
  • [5] Recurrent Bayesian reasoning in probabilistic neural networks
    Grim, Jiri
    Hora, Jan
    [J]. ARTIFICIAL NEURAL NETWORKS - ICANN 2007, PT 1, PROCEEDINGS, 2007, 4668 : 129+
  • [6] Efficient and effective training of sparse recurrent neural networks
    Liu, Shiwei
    Ni'mah, Iftitahu
    Menkovski, Vlado
    Mocanu, Decebal Constantin
    Pechenizkiy, Mykola
    [J]. NEURAL COMPUTING & APPLICATIONS, 2021, 33 (15) : 9625 - 9636
  • [7] Universal structural patterns in sparse recurrent neural networks
    Zhang, Xin-Jie
    Moore, Jack Murdoch
    Yan, Gang
    Li, Xiang
    [J]. COMMUNICATIONS PHYSICS, 2023, 6 (01)