Scaled free-energy based reinforcement learning for robust and efficient learning in high-dimensional state spaces

Citations: 8
Authors
Elfwing, Stefan [1 ]
Uchibe, Eiji [1 ]
Doya, Kenji [1 ]
Affiliations
[1] Okinawa Inst Sci & Technol Grad Univ, Neural Computat Unit, Onna Son, Okinawa 9040412, Japan
Source
FRONTIERS IN NEUROROBOTICS, 2013
Keywords
reinforcement learning; free-energy; restricted Boltzmann machine; robot navigation; function approximation; spatial cognition; navigation; model
DOI
10.3389/fnbot.2013.00003
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Free-energy based reinforcement learning (FERL) was proposed for learning in high-dimensional state and action spaces that cannot be handled by standard function approximation methods. In this study, we propose a scaled version of free-energy based reinforcement learning to achieve more robust and more efficient learning performance. The action-value function is approximated by the negative free energy of a restricted Boltzmann machine, divided by a constant scaling factor related to the size of the Boltzmann machine (the square root of the number of state nodes in this study). Our first task is a digit floor gridworld task, where the states are represented by images of handwritten digits from the MNIST data set. The purpose of the task is to investigate the proposed method's ability, through the extraction of task-relevant features in the hidden layer, to cluster images of the same digit and to cluster images of different digits that correspond to states with the same optimal action. We also test the method's robustness with respect to different exploration schedules, i.e., different settings of the initial temperature and the temperature discount rate in softmax action selection. Our second task is a robot visual navigation task, where the robot can learn its position from the different colors of the lower parts of four landmarks and can infer the correct corner goal area from the color of the upper parts of the landmarks. The state space consists of binarized camera images with, at most, nine different colors, which amounts to 6642 binary states. For both tasks, the learning performance is compared with standard FERL and with function approximation where the action-value function is approximated by a two-layered feedforward neural network.
Pages: 10
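To make the method described in the abstract concrete, below is a minimal Python sketch of the scaled free-energy value function: Q(s, a) is the negative free energy of a restricted Boltzmann machine whose visible layer concatenates binary state and action units, divided by the constant scaling factor sqrt(n_state) named in the abstract, with softmax (Boltzmann) action selection under a decaying exploration temperature. Everything beyond what the abstract states, including the class and function names, weight initialization, network sizes, and the exact form of the temperature schedule, is an illustrative assumption rather than the authors' implementation.

```python
import numpy as np


class ScaledFERLValue:
    """Sketch of a scaled free-energy action-value function:
    Q(s, a) = -F(s, a) / Z, where F is the free energy of an RBM over
    concatenated binary state and action units and Z = sqrt(n_state),
    following the abstract. Initialization details are assumptions."""

    def __init__(self, n_state, n_action, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        n_visible = n_state + n_action
        self.W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
        self.b = np.zeros(n_visible)   # visible biases
        self.c = np.zeros(n_hidden)    # hidden biases
        self.Z = np.sqrt(n_state)      # constant scaling factor

    def q_value(self, s, a):
        """Negative scaled free energy for binary state s, one-hot action a."""
        v = np.concatenate([s, a])
        # RBM free energy with the hidden units analytically summed out:
        #   F(v) = -b.v - sum_k log(1 + exp(c_k + (v W)_k))
        free_energy = -self.b @ v - np.sum(
            np.logaddexp(0.0, self.c + v @ self.W))
        return -free_energy / self.Z


def softmax_action(qs, temperature, rng):
    """Softmax (Boltzmann) action selection at the given temperature."""
    prefs = np.asarray(qs) / temperature
    p = np.exp(prefs - prefs.max())    # subtract max for numerical stability
    p /= p.sum()
    return rng.choice(len(qs), p=p)


# Usage: select an action under a decaying temperature,
# T = T0 * decay**episode (an assumed form of the schedule the abstract
# parameterizes by initial temperature and temperature discount rate).
rng = np.random.default_rng(0)
model = ScaledFERLValue(n_state=784, n_action=4, n_hidden=40)
s = rng.integers(0, 2, size=784).astype(float)   # e.g., a binarized image
actions = np.eye(4)                              # one-hot action codes
T0, decay, episode = 5.0, 0.95, 10
T = T0 * decay**episode
a_idx = softmax_action([model.q_value(s, a) for a in actions], T, rng)
```

A plausible reading of the design choice: dividing the negative free energy by a size-dependent constant keeps the magnitude of the Q-values, and therefore the effective softmax temperature, roughly comparable across networks of different sizes, which is consistent with the robustness to exploration schedules that the abstract reports testing.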