Markovian architectural bias of recurrent neural networks

Cited by: 141
Authors
Tino, P [1 ]
Cernansky, M
Benusková, L
Affiliations
[1] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, W Midlands, England
[2] Slovak Univ Technol Bratislava, Fac Elect Engn & Informat Technol, Bratislava 81219, Slovakia
[3] Comenius Univ, Fac Math Phys & Informat, Bratislava 84248, Slovakia
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS, 2004, Vol. 15, No. 1
Keywords
complex symbolic sequences; information latching problem; iterative function systems; Markov models; recurrent neural networks (RNNs);
DOI
10.1109/TNN.2003.820839
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we elaborate upon the claim that clustering in the recurrent layer of recurrent neural networks (RNNs) reflects meaningful information processing states even prior to training [1], [2]. By concentrating on activation clusters in RNNs, while not throwing away the continuous state space network dynamics, we extract predictive models that we call neural prediction machines (NPMs). When RNNs with sigmoid activation functions are initialized with small weights (a common technique in the RNN community), the clusters of recurrent activations emerging prior to training are indeed meaningful and correspond to Markov prediction contexts. In this case, the extracted NPMs correspond to a class of Markov models, called variable memory length Markov models (VLMMs). In order to appreciate how much information has really been induced during the training, the RNN performance should always be compared with that of VLMMs and NPMs extracted before training as the "null" base models. Our arguments are supported by experiments on a chaotic symbolic sequence and a context-free language with a deep recursive structure.
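The extraction procedure described in the abstract can be illustrated with a minimal sketch: drive an untrained sigmoid RNN (small random weights) over a symbolic sequence, cluster the recurrent activations, and build a prediction model from next-symbol counts per cluster. Everything below is a hypothetical illustration of that idea, not the paper's implementation: the alphabet size, hidden size, number of clusters, and the simple k-means routine are all assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-symbol sequence (stand-in for the chaotic symbolic
# sequences studied in the paper).
alphabet = 2
seq = rng.integers(0, alphabet, size=500)

# Untrained recurrent layer: sigmoid units with small i.i.d. weights --
# the initialization regime under which the Markovian bias emerges.
hidden = 8
W_in = rng.normal(0.0, 0.1, (hidden, alphabet))
W_rec = rng.normal(0.0, 0.1, (hidden, hidden))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Drive the untrained network over the sequence, recording each
# recurrent-layer activation vector.
states = []
h = np.zeros(hidden)
for s in seq:
    x = np.zeros(alphabet)
    x[s] = 1.0
    h = sigmoid(W_in @ x + W_rec @ h)
    states.append(h.copy())
states = np.array(states)

# Plain k-means over the activations; the resulting clusters play the
# role of Markov prediction contexts.
def kmeans(X, k, iters=30):
    centers = X[rng.choice(len(X), k, replace=False)].copy()
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

k = 4  # assumed number of prediction contexts
centers, labels = kmeans(states, k)

# Neural prediction machine: Laplace-smoothed next-symbol distribution
# conditioned on the cluster of the current state.
counts = np.ones((k, alphabet))
for t in range(len(seq) - 1):
    counts[labels[t], seq[t + 1]] += 1
npm = counts / counts.sum(axis=1, keepdims=True)  # rows sum to 1
```

Run before training, such an NPM serves as the "null" base model against which trained-RNN performance should be compared, as the abstract argues.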
Pages: 6-15
Number of pages: 10
Related Papers
50 total
  • [1] Markovian architectural bias of recurrent neural networks
    Tiño, P
    Cernansky, M
    Benuskova, L
    [J]. INTELLIGENT TECHNOLOGIES - THEORY AND APPLICATIONS: NEW TRENDS IN INTELLIGENT TECHNOLOGIES, 2002, 76 : 17 - 23
  • [2] Approaches based on Markovian architectural bias in recurrent neural networks
    Makula, M
    Cernansky, M
    Benusková, L
    [J]. SOFSEM 2004: THEORY AND PRACTICE OF COMPUTER SCIENCE, PROCEEDINGS, 2004, 2932 : 257 - 264
  • [3] Architectural bias in recurrent neural networks - Fractal analysis
    Tiño, P
    Hammer, B
    [J]. ARTIFICIAL NEURAL NETWORKS - ICANN 2002, 2002, 2415 : 1359 - 1364
  • [4] Architectural bias in recurrent neural networks: Fractal analysis
    Tino, P
    Hammer, B
    [J]. NEURAL COMPUTATION, 2003, 15 (08) : 1931 - 1957
  • [5] Architectural Complexity Measures of Recurrent Neural Networks
    Zhang, Saizheng
    Wu, Yuhuai
    Che, Tong
    Lin, Zhouhan
    Memisevic, Roland
    Salakhutdinov, Ruslan
    Bengio, Yoshua
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [6] Utilizing bias to evolve recurrent neural networks
    de Jong, ED
    Pollack, JB
    [J]. IJCNN'01: INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, PROCEEDINGS, 2001, : 2667 - 2672
  • [7] Optimizing Markovian modeling of chaotic systems with recurrent neural networks
    Cechin, Adelmo L.
    Pechmann, Denise R.
    de Oliveira, Luiz P. L.
    [J]. CHAOS SOLITONS & FRACTALS, 2008, 37 (05) : 1317 - 1327
  • [8] Improved access to sequential motifs: A note on the architectural bias of recurrent networks
    Bodén, M
    Hawkins, J
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS, 2005, 16 (02): 491 - 494
  • [9] Stability in distribution of stochastic delay recurrent neural networks with Markovian switching
    Enwen Zhu
    George Yin
    Quan Yuan
    [J]. Neural Computing and Applications, 2016, 27 : 2141 - 2151
  • [10] Stability Analysis of Recurrent Neural Networks with Random Delay and Markovian Switching
    Enwen Zhu
    Yong Wang
    Yueheng Wang
    Hanjun Zhang
    Jiezhong Zou
    [J]. Journal of Inequalities and Applications, 2010