Autonomous reinforcement learning with experience replay

Cited by: 38
Authors
Wawrzynski, Pawel [1]
Tanwani, Ajay Kumar [1,2]
Affiliations
[1] Warsaw Univ Technol, Inst Control & Computat Engn, Warsaw, Poland
[2] Ecole Polytech Fed Lausanne, CH-1015 Lausanne, Switzerland
Keywords
Reinforcement learning; Autonomous learning; Step-size estimation; Actor-critic; ACTOR-CRITIC ALGORITHMS; RATE ADAPTATION; ENVIRONMENTS; CONVERGENCE; NETWORKS;
DOI
10.1016/j.neunet.2012.11.007
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper considers the issues of efficiency and autonomy that are required to make reinforcement learning suitable for real-life control tasks. A real-time reinforcement learning algorithm is presented that repeatedly adjusts the control policy with the use of previously collected samples, and autonomously estimates the appropriate step-sizes for the learning updates. The algorithm is based on the actor-critic with experience replay, whose step-sizes are determined on-line by an enhanced fixed point algorithm for on-line neural network training. An experimental study with a simulated octopus arm and a half-cheetah demonstrates that the proposed algorithm can solve difficult learning control problems autonomously within a reasonably short time. © 2012 Elsevier Ltd. All rights reserved.
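The idea of repeatedly adjusting an estimate from previously collected samples can be illustrated with a minimal Python sketch. It is not the paper's actor-critic algorithm: the names `ReplayBuffer` and `replay_updates` are hypothetical, the learner is a single scalar value estimate, and the step size is a fixed constant chosen by hand, whereas the paper estimates step-sizes autonomously.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) tuples."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest item when full.
        self.data = deque(maxlen=capacity)

    def add(self, transition):
        self.data.append(transition)

    def sample(self, rng):
        # Draw one stored transition uniformly at random.
        return rng.choice(self.data)

    def __len__(self):
        return len(self.data)


def replay_updates(buffer, value, step_size, n_updates, rng):
    """Move a scalar value estimate toward rewards replayed from the buffer.

    Assumption: a fixed, hand-picked step_size stands in for the
    autonomous step-size estimation described in the abstract.
    """
    for _ in range(n_updates):
        _, _, reward, _ = buffer.sample(rng)
        value += step_size * (reward - value)
    return value


if __name__ == "__main__":
    rng = random.Random(0)
    buffer = ReplayBuffer(capacity=100)
    value = 0.0
    for step in range(50):
        # One "real" interaction: a toy environment with random rewards.
        buffer.add((0, 0, rng.uniform(0.0, 2.0), 0))
        # Several replayed updates per real interaction.
        value = replay_updates(buffer, value, step_size=0.05,
                               n_updates=10, rng=rng)
    print(f"value estimate after replay: {value:.3f}")
```

Each real interaction is followed by several updates drawn from stored samples, which is the general sense in which experience replay reuses data.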
Pages: 156-167
Page count: 12
Related papers
50 in total
  • [1] Autonomous Reinforcement Learning with Experience Replay for Humanoid Gait Optimization
    Wawrzynski, Pawel
    [J]. PROCEEDINGS OF THE INTERNATIONAL NEURAL NETWORK SOCIETY WINTER CONFERENCE (INNS-WC2012), 2012, 13 : 205 - 211
  • [2] Continuous Reinforcement Learning From Human Demonstrations With Integrated Experience Replay for Autonomous Driving
    Zuo, Sixiang
    Wang, Zhiyang
    Zhu, Xiaorui
    Ou, Yongsheng
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND BIOMIMETICS (IEEE ROBIO 2017), 2017, : 2450 - 2455
  • [3] SELECTIVE EXPERIENCE REPLAY IN REINFORCEMENT LEARNING FOR REIDENTIFICATION
    Thakoor, Ninad
    Bhanu, Bir
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2016, : 4250 - 4254
  • [4] Prioritized experience replay based reinforcement learning for adaptive tracking control of autonomous underwater vehicle
    Li, Ting
    Yang, Dongsheng
    Xie, Xiangpeng
    [J]. APPLIED MATHEMATICS AND COMPUTATION, 2023, 443
  • [5] Efficient experience replay architecture for offline reinforcement learning
    Zhang, Longfei
    Feng, Yanghe
    Wang, Rongxiao
    Xu, Yue
    Xu, Naifu
    Liu, Zeyi
    Du, Hang
    [J]. ROBOTIC INTELLIGENCE AND AUTOMATION, 2023, 43 (01): : 35 - 43
  • [6] Clustering experience replay for the effective exploitation in reinforcement learning
    Li, Min
    Huang, Tianyi
    Zhu, William
    [J]. PATTERN RECOGNITION, 2022, 131
  • [7] Deep Reinforcement Learning with Experience Replay Based on SARSA
    Zhao, Dongbin
    Wang, Haitao
    Shao, Kun
    Zhu, Yuanheng
    [J]. PROCEEDINGS OF 2016 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2016,
  • [8] Multimodal fusion for autonomous navigation via deep reinforcement learning with sparse rewards and hindsight experience replay
    Xiao, Wendong
    Yuan, Liang
    Ran, Teng
    He, Li
    Zhang, Jianbo
    Cui, Jianping
    [J]. DISPLAYS, 2023, 78
  • [9] Multi-Input Autonomous Driving Based on Deep Reinforcement Learning With Double Bias Experience Replay
    Cui, Jianping
    Yuan, Liang
    He, Li
    Xiao, Wendong
    Ran, Teng
    Zhang, Jianbo
    [J]. IEEE SENSORS JOURNAL, 2023, 23 (11) : 11253 - 11261
  • [10] Experience Replay for Real-Time Reinforcement Learning Control
    Adam, Sander
    Busoniu, Lucian
    Babuska, Robert
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2012, 42 (02): : 201 - 212