Markovian architectural bias of recurrent neural networks

Cited by: 141
Authors
Tino, P [1 ]
Cernansky, M
Benusková, L
Affiliations
[1] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, W Midlands, England
[2] Slovak Univ Technol Bratislava, Fac Elect Engn & Informat Technol, Bratislava 81219, Slovakia
[3] Comenius Univ, Fac Math Phys & Informat, Bratislava 84248, Slovakia
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS, 2004, Vol. 15, No. 1
Keywords
complex symbolic sequences; information latching problem; iterative function systems; Markov models; recurrent neural networks (RNNs);
DOI
10.1109/TNN.2003.820839
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we elaborate upon the claim that clustering in the recurrent layer of recurrent neural networks (RNNs) reflects meaningful information processing states even prior to training [1], [2]. By concentrating on activation clusters in RNNs, while not throwing away the continuous state space network dynamics, we extract predictive models that we call neural prediction machines (NPMs). When RNNs with sigmoid activation functions are initialized with small weights (a common technique in the RNN community), the clusters of recurrent activations emerging prior to training are indeed meaningful and correspond to Markov prediction contexts. In this case, the extracted NPMs correspond to a class of Markov models, called variable memory length Markov models (VLMMs). In order to appreciate how much information has really been induced during the training, the RNN performance should always be compared with that of VLMMs and NPMs extracted before training as the "null" base models. Our arguments are supported by experiments on a chaotic symbolic sequence and a context-free language with a deep recursive structure.
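The extraction procedure described in the abstract can be illustrated with a minimal sketch: drive an untrained sigmoid RNN (small random weights) over a symbolic sequence, cluster the recurrent activations, and build a prediction model from next-symbol counts per cluster. Everything below is a hypothetical illustration of that idea, not the paper's implementation: the alphabet size, hidden size, number of clusters, and the simple k-means routine are all assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-symbol sequence (stand-in for the chaotic symbolic
# sequences studied in the paper).
alphabet = 2
seq = rng.integers(0, alphabet, size=500)

# Untrained recurrent layer: sigmoid units with small i.i.d. weights --
# the initialization regime under which the Markovian bias emerges.
hidden = 8
W_in = rng.normal(0.0, 0.1, (hidden, alphabet))
W_rec = rng.normal(0.0, 0.1, (hidden, hidden))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Drive the untrained network over the sequence, recording each
# recurrent-layer activation vector.
states = []
h = np.zeros(hidden)
for s in seq:
    x = np.zeros(alphabet)
    x[s] = 1.0
    h = sigmoid(W_in @ x + W_rec @ h)
    states.append(h.copy())
states = np.array(states)

# Plain k-means over the activations; the resulting clusters play the
# role of Markov prediction contexts.
def kmeans(X, k, iters=30):
    centers = X[rng.choice(len(X), k, replace=False)].copy()
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

k = 4  # assumed number of prediction contexts
centers, labels = kmeans(states, k)

# Neural prediction machine: Laplace-smoothed next-symbol distribution
# conditioned on the cluster of the current state.
counts = np.ones((k, alphabet))
for t in range(len(seq) - 1):
    counts[labels[t], seq[t + 1]] += 1
npm = counts / counts.sum(axis=1, keepdims=True)  # rows sum to 1
```

Run before training, such an NPM serves as the "null" base model against which trained-RNN performance should be compared, as the abstract argues.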
Pages: 6-15
Number of pages: 10
Related Papers
50 total
  • [1] Markovian architectural bias of recurrent neural networks
    Tiño, P
    Cernansky, M
    Benuskova, L
    [J]. INTELLIGENT TECHNOLOGIES - THEORY AND APPLICATIONS: NEW TRENDS IN INTELLIGENT TECHNOLOGIES, 2002, 76 : 17 - 23
  • [2] Approaches based on Markovian architectural bias in recurrent neural networks
    Makula, M
    Cernansky, M
    Benusková, L
    [J]. SOFSEM 2004: THEORY AND PRACTICE OF COMPUTER SCIENCE, PROCEEDINGS, 2004, 2932 : 257 - 264
  • [3] Architectural bias in recurrent neural networks - Fractal analysis
    Tiño, P
    Hammer, B
    [J]. ARTIFICIAL NEURAL NETWORKS - ICANN 2002, 2002, 2415 : 1359 - 1364
  • [4] Architectural bias in recurrent neural networks: Fractal analysis
    Tino, P
    Hammer, B
    [J]. NEURAL COMPUTATION, 2003, 15 (08) : 1931 - 1957
  • [5] Architectural Complexity Measures of Recurrent Neural Networks
    Zhang, Saizheng
    Wu, Yuhuai
    Che, Tong
    Lin, Zhouhan
    Memisevic, Roland
    Salakhutdinov, Ruslan
    Bengio, Yoshua
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [6] Utilizing bias to evolve recurrent neural networks
    de Jong, ED
    Pollack, JB
    [J]. IJCNN'01: INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, PROCEEDINGS, 2001, : 2667 - 2672
  • [7] Optimizing Markovian modeling of chaotic systems with recurrent neural networks
    Cechin, Adelmo L.
    Pechmann, Denise R.
    de Oliveira, Luiz P. L.
    [J]. CHAOS SOLITONS & FRACTALS, 2008, 37 (05) : 1317 - 1327
  • [8] Improved access to sequential motifs: A note on the architectural bias of recurrent networks
    Bodén, M
    Hawkins, J
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS, 2005, 16 (02): 491 - 494
  • [9] Stability in distribution of stochastic delay recurrent neural networks with Markovian switching
    Enwen Zhu
    George Yin
    Quan Yuan
    [J]. Neural Computing and Applications, 2016, 27 : 2141 - 2151
  • [10] Stability Analysis of Recurrent Neural Networks with Random Delay and Markovian Switching
    Enwen Zhu
    Yong Wang
    Yueheng Wang
    Hanjun Zhang
    Jiezhong Zou
    [J]. Journal of Inequalities and Applications, 2010