I2RL: online inverse reinforcement learning under occlusion

Cited by: 0
Authors
Saurabh Arora
Prashant Doshi
Bikramjit Banerjee
Affiliations
[1] University of Georgia,THINC Lab, Department of Computer Science, 415 Boyd GSRC
[2] University of Southern Mississippi,School of Computing Sciences and Computer Engineering
Keywords
Robot learning; Online learning; Robotics; Reinforcement learning; Inverse reinforcement learning;
DOI: not available
Abstract
Inverse reinforcement learning (IRL) is the problem of learning the preferences of an agent from observing its behavior on a task. It inverts RL, which focuses on learning an agent’s behavior on a task based on the reward signals received. IRL is witnessing sustained attention due to promising applications in robotics, computer games, and finance, as well as in other sectors. Methods for IRL have, for the most part, focused on batch settings where the observed agent’s behavioral data has already been collected. However, the related problem of online IRL—where observations are incrementally accrued, yet the real-time demands of the application often prohibit a full rerun of an IRL method—has received significantly less attention. We introduce the first formal framework for online IRL, called incremental IRL (I2RL), which can serve as a common ground for online IRL methods. We demonstrate the usefulness of this framework by casting existing online IRL techniques into it. Importantly, we present a new method that advances maximum entropy IRL with hidden variables to the online setting. Our analysis shows that the new method has monotonically improving performance with more demonstration data as well as probabilistically bounded error, both under full and partial observability. Simulated and physical robot experiments in a multi-robot patrolling application situated in worlds of varying sizes, which involves learning under high levels of occlusion, show significantly improved performance of I2RL as compared to both batch IRL and an online imitation learning method.
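To make the incremental idea in the abstract concrete, the sketch below shows one common way an online max-entropy IRL update can avoid re-processing all past demonstrations: maintain a running mean of empirical feature expectations as a sufficient statistic and take one gradient step per session. This is an illustrative toy, not the paper’s I2RL algorithm: the linear reward, the one-hot feature map, and the externally supplied policy feature expectations are all assumptions, and occlusion handling is omitted.

```python
import numpy as np

def session_feature_expectation(trajectories, phi, gamma=0.95):
    """Discounted empirical feature expectations of one session's demonstrations."""
    mu = np.zeros_like(phi(trajectories[0][0]), dtype=float)
    for traj in trajectories:
        for t, s in enumerate(traj):
            mu += (gamma ** t) * phi(s)
    return mu / len(trajectories)

class IncrementalMaxEntIRL:
    """Keeps a running mean of demonstration feature expectations as a
    sufficient statistic, so each new session triggers one update rather
    than a full rerun over all past demonstrations (illustrative only)."""

    def __init__(self, n_features, lr=0.1):
        self.w = np.zeros(n_features)       # reward weights, r(s) = w . phi(s)
        self.mu_demo = np.zeros(n_features) # running empirical statistic
        self.n_sessions = 0
        self.lr = lr

    def update(self, mu_session, mu_policy):
        # Merge the new session's statistic into the running mean ...
        self.n_sessions += 1
        self.mu_demo += (mu_session - self.mu_demo) / self.n_sessions
        # ... then take one max-ent gradient step: grad = mu_demo - mu_policy,
        # where mu_policy would come from the current soft-optimal policy
        # (supplied externally here as a stand-in).
        self.w += self.lr * (self.mu_demo - mu_policy)
        return self.w

# Tiny usage: two-state world with one-hot features. The demonstrator lingers
# in state 0, so its reward weight should rise above state 1's after an update.
phi = lambda s: np.eye(2)[s]
learner = IncrementalMaxEntIRL(n_features=2)
mu1 = session_feature_expectation([[0, 0, 1]], phi)
w = learner.update(mu1, mu_policy=np.array([0.5, 0.5]))
```

The constant-memory merge step is what makes the scheme "online": each session contributes its statistic once and can then be discarded, matching the real-time constraint the abstract describes.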
Related papers (50 total)
  • [1] I2RL: online inverse reinforcement learning under occlusion
    Arora, Saurabh
    Doshi, Prashant
    Banerjee, Bikramjit
    [J]. AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2021, 35 (01)
  • [2] Online Inverse Reinforcement Learning Under Occlusion
    Arora, Saurabh
    Doshi, Prashant
    Banerjee, Bikramjit
    [J]. AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 1170 - 1178
  • [3] Multi-Robot Inverse Reinforcement Learning under Occlusion with Interactions
    Bogert, Kenneth
    Doshi, Prashant
    [J]. AAMAS'14: PROCEEDINGS OF THE 2014 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2014, : 173 - 180
  • [4] Multi-robot inverse reinforcement learning under occlusion with estimation of state transitions
    Bogert, Kenneth
    Doshi, Prashant
    [J]. ARTIFICIAL INTELLIGENCE, 2018, 263 : 46 - 73
  • [5] Multi-Robot Inverse Reinforcement Learning Under Occlusion with State Transition Estimation
    Bogert, Kenneth
    Doshi, Prashant
    [J]. PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS (AAMAS'15), 2015, : 1837 - 1838
  • [6] Online inverse reinforcement learning with limited data
    Self, Ryan
    Mahmud, S. M. Nahid
    Hareland, Katrine
    Kamalapurkar, Rushikesh
    [J]. 2020 59TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2020, : 603 - 608
  • [7] Scaling Expectation-Maximization for Inverse Reinforcement Learning to Multiple Robots under Occlusion
    Bogert, Kenneth
    Doshi, Prashant
    [J]. AAMAS'17: PROCEEDINGS OF THE 16TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2017, : 522 - 529
  • [8] Online inverse reinforcement learning for nonlinear systems
    Self, Ryan
    Harlan, Michael
    Kamalapurkar, Rushikesh
    [J]. 2019 3RD IEEE CONFERENCE ON CONTROL TECHNOLOGY AND APPLICATIONS (IEEE CCTA 2019), 2019, : 296 - 301
  • [9] Online inverse reinforcement learning for systems with disturbances
    Self, Ryan
    Abudia, Moad
    Kamalapurkar, Rushikesh
    [J]. 2020 AMERICAN CONTROL CONFERENCE (ACC), 2020, : 1118 - 1123
  • [10] Generative Inverse Deep Reinforcement Learning for Online Recommendation
    Chen, Xiaocong
    Yao, Lina
    Sun, Aixin
    Wang, Xianzhi
    Xu, Xiwei
    Zhu, Liming
    [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 201 - 210