A survey of inverse reinforcement learning

Cited by: 0
Authors
Stephen Adams
Tyler Cody
Peter A. Beling
Affiliations
[1] Virginia Tech, Hume Center for National Security and Technology
Keywords
Reinforcement learning; Inverse reinforcement learning; Inverse optimal control; Apprenticeship learning; Learning from demonstration
DOI
Not available
Abstract
Learning from demonstration, or imitation learning, is the process of learning to act in an environment from examples provided by a teacher. Inverse reinforcement learning (IRL) is a specific form of learning from demonstration that attempts to estimate the reward function of a Markov decision process from examples provided by the teacher. The reward function is often considered the most succinct description of a task. In simple applications, the reward function may be known or easily derived from properties of the system and hard-coded into the learning process. In complex applications, however, this may not be possible, and it may be easier to learn the reward function by observing the actions of the teacher. This paper provides a comprehensive survey of the literature on IRL. The survey outlines the differences between IRL and two closely related methods, apprenticeship learning and inverse optimal control; organizes the IRL literature according to the principal method used; describes applications of IRL algorithms; and identifies areas for future research.
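To make the problem statement concrete, the sketch below implements one representative approach from the family the survey covers, maximum-entropy IRL (Ziebart et al., 2008), on a toy chain MDP: expert demonstrations walk toward a goal state, and gradient ascent on reward weights matches the expert's empirical feature expectations. This is a minimal illustrative sketch, not code from the paper; the environment, horizon, learning rate, and all function names are assumptions chosen for brevity.

```python
import numpy as np

# Toy chain MDP: 5 states in a row, actions 0 = left, 1 = right, deterministic moves.
# All constants here are illustrative, not from the surveyed paper.
N_STATES, HORIZON, GAMMA = 5, 8, 0.9
P = np.zeros((N_STATES, 2, N_STATES))            # P[s, a, s'] transition probabilities
for s in range(N_STATES):
    P[s, 0, max(s - 1, 0)] = 1.0
    P[s, 1, min(s + 1, N_STATES - 1)] = 1.0

# "Expert" demonstrations: the teacher always moves right toward the goal state 4.
def expert_demo():
    s, traj = 0, []
    for _ in range(HORIZON):
        traj.append(s)
        s = min(s + 1, N_STATES - 1)
    return traj

demos = [expert_demo() for _ in range(20)]

# With one-hot state features, the expert's feature expectations reduce to the
# average per-state visitation counts over the demonstrations.
mu_expert = np.zeros(N_STATES)
for traj in demos:
    for s in traj:
        mu_expert[s] += 1.0
mu_expert /= len(demos)

def soft_value_iteration(r, iters=200):
    """Soft (log-sum-exp) Bellman backups; returns a stochastic policy pi[s, a]."""
    V = np.zeros(N_STATES)
    for _ in range(iters):
        Q = r[:, None] + GAMMA * (P @ V)          # Q[s, a]
        m = Q.max(axis=1)
        V = m + np.log(np.exp(Q - m[:, None]).sum(axis=1))
    return np.exp(Q - V[:, None])                 # softmax (max-ent) policy

def expected_visitation(pi):
    """Expected state-visitation counts over HORIZON steps, starting in state 0."""
    d = np.zeros(N_STATES)
    d[0] = 1.0
    mu = np.zeros(N_STATES)
    for _ in range(HORIZON):
        mu += d
        d = np.einsum('s,sa,sat->t', d, pi, P)    # one-step state distribution update
    return mu

# Gradient ascent on the max-ent log-likelihood:
# gradient = expert feature expectations - learner's expected visitations.
theta = np.zeros(N_STATES)   # reward weights; r(s) = theta[s] for one-hot features
for _ in range(200):
    pi = soft_value_iteration(theta)
    theta += 0.1 * (mu_expert - expected_visitation(pi))

print("Recovered reward weights:", np.round(theta, 2))
```

As is standard in IRL, the recovered weights are only identifiable up to transformations such as reward shaping, so they should be read qualitatively: the weight on the goal state dominates, which is what the demonstrations imply about the task.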
Pages: 4307-4346
Page count: 39
Related papers (50 total)
  • [21] Training parsers by inverse reinforcement learning
    Neu, Gergely
    Szepesvári, Csaba
    [J]. Machine Learning, 2009, 77(2-3): 303-337
  • [22] Compatible Reward Inverse Reinforcement Learning
    Metelli, Alberto Maria
    Pirotta, Matteo
    Restelli, Marcello
    [C]. Advances in Neural Information Processing Systems 30 (NIPS 2017), 2017, 30
  • [23] Preference Elicitation and Inverse Reinforcement Learning
    Rothkopf, Constantin A.
    Dimitrakakis, Christos
    [C]. Machine Learning and Knowledge Discovery in Databases, Part III, 2011, 6913: 34-48
  • [24] Inverse Reinforcement Learning with Constraint Recovery
    Das, Nirjhar
    Chattopadhyay, Arpan
    [C]. Pattern Recognition and Machine Intelligence (PReMI 2023), 2023, 14301: 179-188
  • [25] Hierarchical Bayesian Inverse Reinforcement Learning
    Choi, Jaedeug
    Kim, Kee-Eung
    [J]. IEEE Transactions on Cybernetics, 2015, 45(4): 793-805
  • [26] Inverse Reinforcement Learning for Strategy Identification
    Rucker, Mark
    Adams, Stephen
    Hayes, Roy
    Beling, Peter A.
    [C]. 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2021: 3067-3074
  • [27] Inverse reinforcement learning in contextual MDPs
    Belogolovsky, Stav
    Korsunsky, Philip
    Mannor, Shie
    Tessler, Chen
    Zahavy, Tom
    [J]. Machine Learning, 2021, 110(9): 2295-2334
  • [30] Recent Advancements in Inverse Reinforcement Learning
    Metelli, Alberto Maria
    [C]. Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024, 38(20): 22680