Vision-Based Reinforcement Learning using Approximate Policy Iteration

Citations: 0

Authors:

Shaker, Marwan R. [1]

Yue, Shigang [1]

Duckett, Tom [1]

Affiliation:

[1] Lincoln Univ, Dept Comp & Informat, Lincoln LN6 7TS, England

Source:

ICAR: 2009 14TH INTERNATIONAL CONFERENCE ON ADVANCED ROBOTICS, VOLS 1 AND 2 | 2009

Keywords:

DOI:

Not available

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

A major issue for reinforcement learning (RL) applied to robotics is the time required to learn a new skill. While RL has been used to learn mobile robot control in many simulated domains, applications involving learning on real robots are still relatively rare. In this paper, the Least-Squares Policy Iteration (LSPI) reinforcement learning algorithm and a new model-based algorithm, Least-Squares Policy Iteration with Prioritized Sweeping (LSPI+), are implemented on a mobile robot to acquire new skills quickly and efficiently. LSPI+ combines the benefits of LSPI and prioritized sweeping, which uses all previous experience to focus the computational effort on the most "interesting" or dynamic parts of the state space. The proposed algorithms are tested on a household vacuum cleaner robot for learning a docking task using vision as the only sensor modality. In experiments, these algorithms are compared to other model-based and model-free RL algorithms. The results show that the number of trials required to learn the docking task is significantly reduced using LSPI compared to the other RL algorithms investigated, and that LSPI+ further improves on the performance of LSPI.
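The abstract names LSPI but does not reproduce it, so the following is only a minimal sketch of generic least-squares policy iteration, not the authors' implementation. Everything in it is an assumption made for illustration: the 4-state chain task, the one-hot features, and the names `step`, `phi`, `lstdq` and `lspi`. The prioritized-sweeping component of LSPI+ and the vision-based docking task are not shown.

```python
# Toy LSPI sketch (assumed setup, not the paper's robot task):
# a 4-state chain, actions 0 = left and 1 = right, reward 1 on entering state 3.
N_STATES, N_ACTIONS, GAMMA = 4, 2, 0.9
K = N_STATES * N_ACTIONS  # number of one-hot features

def step(s, a):
    """Deterministic toy dynamics; returns (next_state, reward)."""
    s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

def phi(s, a):
    """One-hot feature vector for a state-action pair."""
    v = [0.0] * K
    v[s * N_ACTIONS + a] = 1.0
    return v

def q_value(w, s, a):
    return sum(wi * fi for wi, fi in zip(w, phi(s, a)))

def greedy(w, s):
    """Greedy action under weights w (ties go to the lowest action index)."""
    return max(range(N_ACTIONS), key=lambda a: q_value(w, s, a))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def lstdq(samples, w):
    """Evaluate the policy that is greedy w.r.t. w from stored samples."""
    A = [[0.0] * K for _ in range(K)]
    b = [0.0] * K
    for s, a, r, s2 in samples:
        f = phi(s, a)
        f2 = phi(s2, greedy(w, s2))
        for i in range(K):
            if f[i] == 0.0:
                continue
            b[i] += f[i] * r
            for j in range(K):
                A[i][j] += f[i] * (f[j] - GAMMA * f2[j])
    return solve(A, b)

def lspi(samples, max_iter=20, tol=1e-9):
    """Alternate evaluation and greedy improvement until the weights settle."""
    w = [0.0] * K
    for _ in range(max_iter):
        w_new = lstdq(samples, w)
        if max(abs(x - y) for x, y in zip(w, w_new)) < tol:
            return w_new
        w = w_new
    return w

# Usage: one sample per state-action pair, reused at every iteration.
samples = []
for s in range(N_STATES):
    for a in range(N_ACTIONS):
        s2, r = step(s, a)
        samples.append((s, a, r, s2))
weights = lspi(samples)
policy = [greedy(weights, s) for s in range(N_STATES)]
print(policy)
```

The same batch of samples is reused in every iteration, which is the property that makes this family of methods sample-efficient; on a real robot the samples would come from recorded trials rather than from an enumerated toy model.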

Pages: 594-599

Page count: 6

50 records in total

[21] Vision-based Navigation of UAV with Continuous Action Space Using Deep Reinforcement Learning
Zhou, Benchun
Wang, Weihong
Liu, Zhenghua
Wang, Jia
[J]. PROCEEDINGS OF THE 2019 31ST CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2019), 2019, : 5030 - 5035
[22] Augmenting Vision-Based Grasp Plans for Soft Robotic Grippers using Reinforcement Learning
Vatsal, Vighnesh
George, Nijil
[J]. 2022 IEEE 18TH INTERNATIONAL CONFERENCE ON AUTOMATION SCIENCE AND ENGINEERING (CASE), 2022, : 1904 - 1909
[23] Preference-Based Policy Iteration: Leveraging Preference Learning for Reinforcement Learning
Cheng, Weiwei
Fuernkranz, Johannes
Huellermeier, Eyke
Park, Sang-Hyeun
[J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT I, 2011, 6911 : 312 - 327
[24] Approximate Policy Iteration with Unsupervised Feature Learning based on Manifold Regularization
Li, Hongliang
Liu, Derong
Wang, Ding
[J]. 2015 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2015,
[25] Multiagent Reinforcement Learning: Rollout and Policy Iteration
Dimitri Bertsekas
[J]. IEEE/CAA Journal of Automatica Sinica, 2021, 8 (02) : 249 - 272
[26] Vision-Based Mobile Robotics Obstacle Avoidance With Deep Reinforcement Learning
Wenzel, Patrick
Schoen, Torsten
Leal-Taixe, Laura
Cremers, Daniel
[J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 14360 - 14366
[27] Quantum reinforcement learning via policy iteration
El Amine Cherrat
Iordanis Kerenidis
Anupam Prakash
[J]. Quantum Machine Intelligence, 2023, 5
[28] Purposive behavior acquisition for a real robot by vision-based reinforcement learning
Asada, M
Noda, S
Tawaratsumida, S
Hosoda, K
[J]. MACHINE LEARNING, 1996, 23 (2-3) : 279 - 303
[29] Online Evolution of Deep Convolutional Network for Vision-Based Reinforcement Learning
Koutnik, Jan
Schmidhuber, Juergen
Gomez, Faustino
[J]. FROM ANIMALS TO ANIMATS 13, 2014, 8575 : 260 - 269
[30] Multiagent Reinforcement Learning: Rollout and Policy Iteration
Bertsekas, Dimitri
[J]. IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2021, 8 (02) : 249 - 272
