Vision-Based Reinforcement Learning using Approximate Policy Iteration

Cited by: 0
Authors
Shaker, Marwan R. [1]
Yue, Shigang [1]
Duckett, Tom [1]
Affiliations
[1] Lincoln Univ, Dept Comp & Informat, Lincoln LN6 7TS, England
Keywords
DOI
Not available
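Chinese Library Classification number
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809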
Abstract
A major issue for reinforcement learning (RL) applied to robotics is the time required to learn a new skill. While RL has been used to learn mobile robot control in many simulated domains, applications involving learning on real robots are still relatively rare. In this paper, the Least-Squares Policy Iteration (LSPI) reinforcement learning algorithm and a new model-based algorithm, Least-Squares Policy Iteration with Prioritized Sweeping (LSPI+), are implemented on a mobile robot to acquire new skills quickly and efficiently. LSPI+ combines the benefits of LSPI and prioritized sweeping, which uses all previous experience to focus the computational effort on the most "interesting" or dynamic parts of the state space. The proposed algorithms are tested on a household vacuum cleaner robot learning a docking task with vision as the only sensor modality. In experiments, these algorithms are compared to other model-based and model-free RL algorithms. The results show that LSPI requires significantly fewer trials to learn the docking task than the other RL algorithms investigated, and that LSPI+ further improves on the performance of LSPI.
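The prioritized-sweeping idea named in the abstract can be illustrated with a minimal tabular sketch. This is the classic tabular form applied to a hypothetical five-state chain, not the authors' LSPI+ (which combines the idea with least-squares value approximation and is not specified here). The function names, the chain environment, and the parameters `gamma`, `theta`, and `max_backups` are all assumptions made for illustration.

```python
import heapq
from collections import defaultdict


def chain_transitions(n_states):
    """Hypothetical deterministic chain: action 0 moves left, action 1 moves right.
    The last state is a terminal goal; entering it yields reward 1."""
    goal = n_states - 1
    data = []
    for s in range(goal):
        for a, step in ((0, -1), (1, +1)):
            s_next = min(max(s + step, 0), goal)
            data.append((s, a, 1.0 if s_next == goal else 0.0, s_next))
    return data


def prioritized_sweeping(transitions, n_actions, terminal,
                         gamma=0.9, theta=1e-6, max_backups=1000):
    """Tabular prioritized sweeping over a deterministic model built from
    stored (s, a, r, s_next) experience. Backups are ordered by the size
    of the Bellman error, so effort goes where values are still changing."""
    model = {}                       # (s, a) -> (r, s_next)
    predecessors = defaultdict(set)  # s_next -> {(s, a), ...}
    for s, a, r, s_next in transitions:
        model[(s, a)] = (r, s_next)
        predecessors[s_next].add((s, a))

    q = defaultdict(float)

    def state_value(s):
        if s in terminal:
            return 0.0
        return max(q[(s, a)] for a in range(n_actions))

    def bellman_error(s, a):
        r, s_next = model[(s, a)]
        return abs(r + gamma * state_value(s_next) - q[(s, a)])

    # heapq is a min-heap, so priorities are stored negated.
    queue = []
    for s, a in model:
        p = bellman_error(s, a)
        if p > theta:
            heapq.heappush(queue, (-p, (s, a)))

    backups = 0
    while queue and backups < max_backups:
        _, (s, a) = heapq.heappop(queue)
        r, s_next = model[(s, a)]
        q[(s, a)] = r + gamma * state_value(s_next)  # full backup
        backups += 1
        # Changing Q at s may invalidate values of pairs that lead into s.
        for ps, pa in predecessors[s]:
            p = bellman_error(ps, pa)
            if p > theta:
                heapq.heappush(queue, (-p, (ps, pa)))
    return dict(q), backups


N_STATES = 5
q, backups = prioritized_sweeping(chain_transitions(N_STATES), n_actions=2,
                                  terminal={N_STATES - 1})
# Greedy policy for each non-terminal state.
policy = {s: max(range(2), key=lambda a: q.get((s, a), 0.0))
          for s in range(N_STATES - 1)}
```

In this sketch the first backup happens at the state next to the goal, and later backups propagate backwards along the chain, which is the sense in which computation is focused on the parts of the state space where values are changing. Stale duplicate queue entries are tolerated for simplicity; they only cost a redundant backup.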
Pages: 594-599
Page count: 6