Multi-robot inverse reinforcement learning under occlusion with estimation of state transitions

Cited by: 6

Authors:

Bogert, Kenneth [1]

Doshi, Prashant [2]

Affiliations:

[1] Univ N Carolina, Dept Comp Sci, Asheville, NC 28804 USA

[2] Univ Georgia, Dept Comp Sci, THING Lab, Athens, GA 30602 USA

Source:

ARTIFICIAL INTELLIGENCE | 2018 / Vol. 263

Keywords:

Inverse reinforcement learning; Robotics; Machine learning; Maximum entropy; EM ALGORITHM; COMPLEXITY;

DOI:

10.1016/j.artint.2018.07.002

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Inverse reinforcement learning (IRL), analogously to RL, refers to both the problem and associated methods by which an agent, passively observing another agent's actions over time, seeks to learn the latter's reward function. The learning agent is typically called the learner, while the observed agent is often an expert in popular applications such as learning from demonstrations. Some of the assumptions that underlie current IRL methods are impractical for many robotic applications. Specifically, they assume that the learner has full observability of the expert as it performs its task; that the learner has full knowledge of the expert's dynamics; and that there is always only one expert agent in the environment. For example, these assumptions are particularly restrictive in our application scenario where a subject robot is tasked with penetrating a perimeter patrol by two other robots after observing them from a vantage point. In our instance of this problem, the learner can observe at most 10% of the patrol. We relax these assumptions and systematically generalize a known IRL method, Maximum Entropy IRL, to enable the subject to learn the preferences of the patrolling robots, subsequently their behaviors, and predict their future positions well enough to plan a route to its goal state without being spotted. Challenged by occlusion, multiple interacting robots, and partially known dynamics, we demonstrate empirically that the generalization improves significantly on several baselines in its ability to inversely learn in this application setting. Of note, it leads to significant improvement in the learner's overall success rate of penetrating the patrols. Our methods represent significant steps towards making IRL pragmatic and applicable to real-world contexts. (C) 2018 Elsevier B.V. All rights reserved.


Pages: 46-73

Number of pages: 28

50 records in total

[41] Heterogeneous Multi-Robot Cooperation With Asynchronous Multi-Agent Reinforcement Learning
Zhang, Han
Zhang, Xiaohui
Feng, Zhao
Xiao, Xiaohui
[J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (01) : 159 - 166
[42] Fuzzy Reinforcement Learning and Curriculum Transfer Learning for Micromanagement in Multi-Robot Confrontation
Hu, Chunyang
Xu, Meng
[J]. INFORMATION, 2019, 10 (11)
[43] Multi-Agent Deep Reinforcement Learning for Multi-Robot Applications: A Survey
Orr, James
Dutta, Ayan
[J]. SENSORS, 2023, 23 (07)
[44] Improving Fast Adaptation for Newcomers in Multi-robot Reinforcement Learning System
Li, Yiying
Zhou, Wei
Wang, Huaimin
Ding, Bo
Xu, Kele
[J]. 2019 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING & COMMUNICATIONS, CLOUD & BIG DATA COMPUTING, INTERNET OF PEOPLE AND SMART CITY INNOVATION (SMARTWORLD/SCALCOM/UIC/ATC/CBDCOM/IOP/SCI 2019), 2019, : 753 - 760
[45] Connectivity Guaranteed Multi-robot Navigation via Deep Reinforcement Learning
Lin, Juntong
Yang, Xuyun
Zheng, Peiwei
Cheng, Hui
[J]. CONFERENCE ON ROBOT LEARNING, VOL 100, 2019, 100
[46] Bayesian Reinforcement Learning for Multi-Robot Decentralized Patrolling in Uncertain Environments
Zhou, Xin
Wang, Weiping
Wang, Tao
Lei, Yonglin
Zhong, Fangcheng
[J]. IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2019, 68 (12) : 11691 - 11703
[47] Cooperative Multi-Robot Navigation in Dynamic Environment with Deep Reinforcement Learning
Han, Ruihua
Chen, Shengduo
Hao, Qi
[J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2020, : 448 - 454
[48] A reinforcement learning technique with an adaptive action generator for a multi-robot system
Yasuda, Toshiyuki
Ohkura, Kazuhiro
[J]. FROM ANIMALS TO ANIMATS 10, PROCEEDINGS, 2008, 5040 : 250 - 259
[49] Simulation of multi-robot reinforcement learning for box-pushing problem
Kovac, K
Zivkovic, I
Basic, BD
[J]. MELECON 2004: PROCEEDINGS OF THE 12TH IEEE MEDITERRANEAN ELECTROTECHNICAL CONFERENCE, VOLS 1-3, 2004, : 603 - 606
[50] Deep Reinforcement Learning for Decentralized Multi-Robot Exploration With Macro Actions
Tan, Aaron Hao
Bejarano, Federico Pizarro
Zhu, Yuhan
Ren, Richard
Nejat, Goldie
[J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2023, 8 (01) : 272 - 279
