Online Inverse Reinforcement Learning Under Occlusion

Cited by: 0
Authors
Arora, Saurabh [1 ]
Doshi, Prashant [1 ]
Banerjee, Bikramjit [2 ]
Affiliations
[1] Univ Georgia, Dept Comp Sci, THINC Lab, Athens, GA 30602 USA
[2] Univ Southern Mississippi, Sch Comp Sci & Comp Engn, Hattiesburg, MS 39406 USA
Funding
U.S. National Science Foundation
Keywords
Robot Learning; Online Learning; Robotics; Reinforcement Learning; Inverse Reinforcement Learning;
DOI
None available
Chinese Library Classification
TP301 [Theory, Methods]
Discipline Code
081202
Abstract
Inverse reinforcement learning (IRL) is the problem of learning an agent's preferences from observations of its behavior on a task. While this problem has received sustained attention, the related problem of online IRL, in which observations accrue incrementally yet the application's real-time demands often prohibit a full rerun of an IRL method, has received much less attention. We introduce a formal framework for online IRL, called incremental IRL (I2RL), and a new method that advances maximum-entropy IRL with hidden variables to this setting. Our analysis shows that the new method improves monotonically in performance with more demonstration data and has probabilistically bounded error, under both full and partial observability. Experiments in a simulated robotic application involving learning under occlusion show the significantly improved performance of I2RL compared to both batch IRL and an online imitation learning method.
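The abstract describes the core idea of incremental IRL: rather than rerunning a full max-ent IRL optimization as demonstrations accrue, the learner folds each new session into running statistics and warm-starts from its previous estimate. The following is a minimal sketch of that pattern, not the paper's actual algorithm; the class, method names, and `expected_counts_fn` callback are illustrative assumptions.

```python
import numpy as np

def feature_expectations(trajs, phi, gamma=0.95):
    """Discounted empirical feature expectations, averaged over trajectories.

    trajs: list of state-index sequences; phi: |S| x k feature matrix.
    """
    fe = np.zeros(phi.shape[1])
    for traj in trajs:
        for t, s in enumerate(traj):
            fe += (gamma ** t) * phi[s]
    return fe / len(trajs)

class IncrementalMaxEntIRL:
    """Sketch of an incremental max-ent IRL learner (hypothetical API).

    Maintains a running mean of empirical feature expectations so each
    new demonstration session triggers only an incremental update, and
    warm-starts the reward weights from the previous session.
    """
    def __init__(self, n_features, lr=0.1):
        self.w = np.zeros(n_features)       # reward weights: R(s) = w . phi(s)
        self.mu_emp = np.zeros(n_features)  # running empirical feature counts
        self.n_sessions = 0
        self.lr = lr

    def update(self, new_trajs, phi, expected_counts_fn, steps=10):
        mu_new = feature_expectations(new_trajs, phi)
        self.n_sessions += 1
        # fold the new session into the running empirical estimate
        self.mu_emp += (mu_new - self.mu_emp) / self.n_sessions
        for _ in range(steps):
            # max-ent gradient: empirical minus model feature expectations
            mu_model = expected_counts_fn(self.w)
            self.w += self.lr * (self.mu_emp - mu_model)
        return self.w
```

Here `expected_counts_fn` stands in for the expensive inner step (computing expected feature counts under the soft-optimal policy for the current reward); the point of the incremental scheme is that this step starts from the warm-started weights rather than from scratch. The paper's method additionally handles hidden variables for occluded portions of trajectories, which this sketch omits.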
Pages: 1170-1178 (9 pages)
Related Papers
50 records in total
  • [1] I2RL: online inverse reinforcement learning under occlusion
    Arora, Saurabh
    Doshi, Prashant
    Banerjee, Bikramjit
    [J]. Autonomous Agents and Multi-Agent Systems, 2021, 35(01)
  • [2] Multi-Robot Inverse Reinforcement Learning under Occlusion with Interactions
    Bogert, Kenneth
    Doshi, Prashant
    [J]. AAMAS'14: Proceedings of the 2014 International Conference on Autonomous Agents & Multiagent Systems, 2014: 173-180
  • [3] Multi-robot inverse reinforcement learning under occlusion with estimation of state transitions
    Bogert, Kenneth
    Doshi, Prashant
    [J]. Artificial Intelligence, 2018, 263: 46-73
  • [4] Multi-Robot Inverse Reinforcement Learning Under Occlusion with State Transition Estimation
    Bogert, Kenneth
    Doshi, Prashant
    [J]. Proceedings of the 2015 International Conference on Autonomous Agents & Multiagent Systems (AAMAS'15), 2015: 1837-1838
  • [5] Online inverse reinforcement learning with limited data
    Self, Ryan
    Mahmud, S. M. Nahid
    Hareland, Katrine
    Kamalapurkar, Rushikesh
    [J]. 2020 59th IEEE Conference on Decision and Control (CDC), 2020: 603-608
  • [6] Scaling Expectation-Maximization for Inverse Reinforcement Learning to Multiple Robots under Occlusion
    Bogert, Kenneth
    Doshi, Prashant
    [J]. AAMAS'17: Proceedings of the 16th International Conference on Autonomous Agents and Multiagent Systems, 2017: 522-529
  • [7] Online inverse reinforcement learning for nonlinear systems
    Self, Ryan
    Harlan, Michael
    Kamalapurkar, Rushikesh
    [J]. 2019 3rd IEEE Conference on Control Technology and Applications (IEEE CCTA 2019), 2019: 296-301
  • [8] Online inverse reinforcement learning for systems with disturbances
    Self, Ryan
    Abudia, Moad
    Kamalapurkar, Rushikesh
    [J]. 2020 American Control Conference (ACC), 2020: 1118-1123
  • [9] Generative Inverse Deep Reinforcement Learning for Online Recommendation
    Chen, Xiaocong
    Yao, Lina
    Sun, Aixin
    Wang, Xianzhi
    Xu, Xiwei
    Zhu, Liming
    [J]. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, CIKM 2021, 2021: 201-210