Smoothed Sarsa: Reinforcement Learning for Robot Delivery Tasks

Cited by: 0
Authors
Ramachandran, Deepak [1]
Gupta, Rakesh [2]
Affiliations
[1] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
[2] Honda Res Inst USA Inc, Mountain View, CA 94041 USA
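Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract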
Our goal in this work is to make high-level decisions for mobile robots. In particular, given a queue of prioritized object delivery tasks, we wish to find a sequence of actions in real time to accomplish these tasks efficiently. We introduce a novel reinforcement learning algorithm called Smoothed Sarsa that learns a good policy for these delivery tasks by delaying the backup reinforcement step until the state estimate becomes less uncertain. The state space is modeled by a Dynamic Bayesian Network and updated using a Region-based Particle Filter. We take advantage of the fact that only discrete (topological) representations of entity locations are needed for decision making, which makes both tracking and decision making more efficient. Our experiments show that policy search leads to faster task completion times as well as higher total reward compared to a manually crafted policy. Smoothed Sarsa learns a policy orders of magnitude faster than previous policy search algorithms. We demonstrate our results on the Player/Stage simulator and on the Pioneer robot.
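The abstract only summarizes the algorithm, so the following is a minimal illustrative sketch of the general idea of delaying a Sarsa backup until the state estimate is confident, not the authors' implementation. Everything in it is an assumption made for illustration: the class name `DelayedBackupSarsa`, the entropy threshold `max_entropy`, the use of the most probable state for the backup, and the region names `"hall"` and `"lab"`. Beliefs are plain dictionaries over discrete regions; the Dynamic Bayesian Network and Region-based Particle Filter mentioned in the abstract are not modeled.

```python
import math
from collections import defaultdict


def entropy(belief):
    """Shannon entropy (in nats) of a dict mapping state -> probability."""
    return -sum(p * math.log(p) for p in belief.values() if p > 0.0)


class DelayedBackupSarsa:
    """Tabular Sarsa whose backups wait for a confident state estimate.

    Hypothetical sketch: a transition is held in a pending list until the
    beliefs at both of its time steps have entropy <= max_entropy.
    """

    def __init__(self, alpha=0.5, gamma=0.9, max_entropy=0.3):
        self.alpha = alpha
        self.gamma = gamma
        self.max_entropy = max_entropy
        self.q = defaultdict(float)  # (state, action) -> value estimate
        self.beliefs = {}            # time step -> latest belief over states
        self.pending = []            # (t, action, reward, next_action)

    def set_belief(self, t, belief):
        """Store or revise the belief for time step t (e.g. after smoothing)."""
        self.beliefs[t] = belief

    def record(self, t, action, reward, next_action):
        """Queue a transition taken at time t; its backup is deferred."""
        self.pending.append((t, action, reward, next_action))

    def _confident_state(self, t):
        """Return the most probable state at t, or None if still uncertain."""
        belief = self.beliefs.get(t)
        if belief is None or entropy(belief) > self.max_entropy:
            return None
        return max(belief, key=belief.get)

    def flush(self):
        """Apply the Sarsa update to every transition that is now confident."""
        still_pending = []
        applied = 0
        for t, a, r, a_next in self.pending:
            s = self._confident_state(t)
            s_next = self._confident_state(t + 1)
            if s is None or s_next is None:
                still_pending.append((t, a, r, a_next))
                continue
            target = r + self.gamma * self.q[(s_next, a_next)]
            self.q[(s, a)] += self.alpha * (target - self.q[(s, a)])
            applied += 1
        self.pending = still_pending
        return applied


# Usage: the backup is skipped while the belief at t=0 is ambiguous and
# applied once a revised (smoothed) belief for t=0 becomes available.
agent = DelayedBackupSarsa(alpha=0.5, gamma=0.9, max_entropy=0.3)
agent.set_belief(0, {"hall": 0.5, "lab": 0.5})
agent.set_belief(1, {"lab": 1.0})
agent.record(0, "go_lab", 1.0, "deliver")
first = agent.flush()
agent.set_belief(0, {"hall": 0.95, "lab": 0.05})
second = agent.flush()
```

In this sketch the first `flush()` applies no update because the belief at time 0 is evenly split; after the belief is revised, the second `flush()` applies one ordinary Sarsa update to the entry for `("hall", "go_lab")`. Using the single most probable state is a simplification; weighting the update by the belief would be another reasonable choice.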
Pages: 3327 / +
Page count: 3