Self-organization of action hierarchy and compositionality by reinforcement learning with recurrent neural networks

Cited by: 7
Authors
Han, Dongqi [1]
Doya, Kenji [2]
Tani, Jun [1]
Affiliations
[1] Okinawa Inst Sci & Technol, Cognit Neurorobot Res Unit, Okinawa, Japan
[2] Okinawa Inst Sci & Technol, Neural Computat Unit, Okinawa, Japan
Funding
Japan Society for the Promotion of Science
Keywords
Recurrent neural network; Reinforcement learning; Partially observable Markov decision process; Multiple timescale; Compositionality; TIME SCALES; TIMESCALES; MEMORY; GAME; GO;
DOI
10.1016/j.neunet.2020.06.002
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recurrent neural networks (RNNs) for reinforcement learning (RL) have shown distinct advantages, e.g., solving memory-dependent tasks and meta-learning. However, little effort has been spent on improving RNN architectures and on understanding the underlying neural mechanisms for performance gain. In this paper, we propose a novel, multiple-timescale, stochastic RNN for RL. Empirical results show that the network can autonomously learn to abstract sub-goals and can self-develop an action hierarchy using internal dynamics in a challenging continuous control task. Furthermore, we show that the self-developed compositionality of the network enables faster re-learning when adapting to a new task that is a re-composition of previously learned sub-goals than when starting from scratch. We also found that improved performance can be achieved when neural activities are subject to stochastic rather than deterministic dynamics. © 2020 The Authors. Published by Elsevier Ltd.
Pages: 149-162
Number of pages: 14