Scaled free-energy based reinforcement learning for robust and efficient learning in high-dimensional state spaces

Cited by: 8
Authors
Elfwing, Stefan [1]
Uchibe, Eiji [1]
Doya, Kenji [1]
Affiliations
[1] Grad Univ, Okinawa Inst Sci & Technol, Neural Computat Unit, Onna Son, Okinawa 9040412, Japan
Keywords
reinforcement learning; free-energy; restricted Boltzmann machine; robot navigation; function approximation; SPATIAL COGNITION; NAVIGATION; MODEL;
DOI
10.3389/fnbot.2013.00003
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Free-energy-based reinforcement learning (FERL) was proposed for learning in high-dimensional state and action spaces that cannot be handled by standard function-approximation methods. In this study, we propose a scaled version of free-energy-based reinforcement learning to achieve more robust and more efficient learning performance. The action-value function is approximated by the negative free energy of a restricted Boltzmann machine, divided by a constant scaling factor related to the size of the Boltzmann machine (the square root of the number of state nodes in this study). Our first task is a digit floor gridworld task, in which the states are represented by images of handwritten digits from the MNIST data set. The purpose of the task is to investigate the proposed method's ability, through the extraction of task-relevant features in the hidden layer, to cluster images of the same digit and to cluster images of different digits that correspond to states with the same optimal action. We also test the method's robustness with respect to different exploration schedules, i.e., different settings of the initial temperature and the temperature discount rate in softmax action selection. Our second task is a robot visual navigation task, in which the robot can learn its position from the different colors of the lower parts of four landmarks and can infer the correct corner goal area from the color of the upper parts of the landmarks. The state space consists of binarized camera images with, at most, nine different colors, which is equal to 6642 binary states. For both tasks, the learning performance is compared with standard FERL and with function approximation in which the action-value function is approximated by a two-layered feedforward neural network.
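The abstract's central idea, approximating Q(s, a) by the negative free energy of a restricted Boltzmann machine divided by a scaling factor, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `scaled_q_value` and the flat NumPy parameterization of the RBM are assumptions, and the scaling factor follows the paper's choice of the square root of the number of state nodes.

```python
import numpy as np

def scaled_q_value(state, action, W, b_hidden, b_visible, n_state_nodes):
    """Sketch of the scaled FERL value estimate: Q(s, a) is the negative
    free energy of an RBM over the concatenated (state, action) visible
    vector, divided by sqrt(n_state_nodes)."""
    x = np.concatenate([state, action])  # binary visible layer: state + action nodes
    # Free energy of an RBM with binary hidden units:
    #   F(x) = -b_visible . x - sum_j log(1 + exp(b_hidden_j + W_j . x))
    hidden_input = b_hidden + W @ x
    free_energy = -b_visible @ x - np.sum(np.log1p(np.exp(hidden_input)))
    # Scaled negative free energy serves as the action value.
    return -free_energy / np.sqrt(n_state_nodes)
```

Dividing by a size-dependent constant keeps the magnitude of the value estimate, and hence the softmax action-selection temperatures, in a comparable range as the Boltzmann machine grows.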
Pages: 10
Related papers
50 records
  • [11] Continuous Control for High-Dimensional State Spaces: An Interactive Learning Approach
    Perez-Dattari, Rodrigo
    Celemin, Carlos
    Ruiz-del-Solar, Javier
    Kober, Jens
    2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2019, : 7611 - 7617
  • [12] Robust Methods for High-Dimensional Linear Learning
    Merad, Ibrahim
    Gaiffas, Stephane
    JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
  • [13] Distributionally Robust Model-based Reinforcement Learning with Large State Spaces
    Ramesh, Shyam Sundhar
    Sessa, Pier Giuseppe
    Hu, Yifan
    Krause, Andreas
    Bogunovic, Ilija
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [14] Robust Population Coding in Free-Energy-Based Reinforcement Learning
    Otsuka, Makoto
    Yoshimoto, Junichiro
    Doya, Kenji
    ARTIFICIAL NEURAL NETWORKS - ICANN 2008, PT I, 2008, 5163 : 377 - 386
  • [15] Reinforcement learning for high-dimensional problems with symmetrical actions
    Kamal, MAS
    Murata, J
    2004 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN & CYBERNETICS, VOLS 1-7, 2004, : 6192 - 6197
  • [16] Offline reinforcement learning in high-dimensional stochastic environments
    Félicien Hêche
    Oussama Barakat
    Thibaut Desmettre
    Tania Marx
    Stephan Robert-Nicoud
    Neural Computing and Applications, 2024, 36 : 585 - 598
  • [17] Challenges in High-Dimensional Reinforcement Learning with Evolution Strategies
    Mueller, Nils
    Glasmachers, Tobias
    PARALLEL PROBLEM SOLVING FROM NATURE - PPSN XV, PT II, 2018, 11102 : 411 - 423
  • [19] Emergent Solutions to High-Dimensional Multitask Reinforcement Learning
    Kelly, Stephen
    Heywood, Malcolm I.
    EVOLUTIONARY COMPUTATION, 2018, 26 (03) : 347 - 380
  • [20] Towards Learning Abstract Representations for Locomotion Planning in High-dimensional State Spaces
    Klamt, Tobias
    Behnke, Sven
    2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2019, : 922 - 928