Risk-sensitive Inverse Reinforcement Learning via Coherent Risk Models

Times cited: 0
Authors
Majumdar, Anirudha [1 ]
Singh, Sumeet [1 ]
Mandlekar, Ajay [2 ]
Pavone, Marco [1 ]
Affiliations
[1] Stanford Univ, Dept Aeronaut & Astronaut, Stanford, CA 94305 USA
[2] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA
Source
ROBOTICS: SCIENCE AND SYSTEMS XIII, 2017
Keywords
MARKOV DECISION-PROCESSES; EXPECTED-UTILITY
DOI
Not available
Chinese Library Classification
TP24 [Robotics]
Discipline codes
080202; 1405
Abstract
The literature on Inverse Reinforcement Learning (IRL) typically assumes that humans take actions in order to minimize the expected value of a cost function, i.e., that humans are risk neutral. Yet, in practice, humans are often far from being risk neutral. To fill this gap, the objective of this paper is to devise a framework for risk-sensitive IRL in order to explicitly account for an expert's risk sensitivity. To this end, we propose a flexible class of models based on coherent risk metrics, which allow us to capture an entire spectrum of risk preferences from risk-neutral to worst-case. We propose efficient algorithms based on Linear Programming for inferring an expert's underlying risk metric and cost function for a rich class of static and dynamic decision-making settings. The resulting approach is demonstrated on a simulated driving game with ten human participants. Our method is able to infer and mimic a wide range of qualitatively different driving styles from highly risk-averse to risk-neutral in a data-efficient manner. Moreover, comparisons of the Risk-Sensitive (RS) IRL approach with a risk-neutral model show that the RS-IRL framework more accurately captures observed participant behavior both qualitatively and quantitatively.
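For intuition, the sketch below (not from the paper's record above) illustrates one familiar member of the coherent risk metric class, Conditional Value-at-Risk (CVaR), whose tail level alpha sweeps exactly the spectrum the abstract describes: the risk-neutral expectation at alpha = 1, approaching the worst-case cost as alpha approaches 0. The function and example numbers are illustrative assumptions; this is not the paper's LP-based inference procedure.

```python
import numpy as np

def cvar(costs, probs, alpha):
    """CVaR_alpha of a discrete cost distribution (higher cost = worse).

    Illustrative sketch only: CVaR is a single member of the coherent
    risk metric class discussed in the abstract.
    alpha = 1 recovers the risk-neutral expectation E[Z];
    alpha -> 0 approaches the worst-case (maximum) cost.
    """
    costs = np.asarray(costs, dtype=float)
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(costs)[::-1]        # sort outcomes from worst to best
    c, p = costs[order], probs[order]
    cum = np.cumsum(p)                     # cumulative probability mass
    prev = np.concatenate(([0.0], cum[:-1]))
    # each outcome contributes only the part of its mass inside the alpha-tail
    w = np.minimum(cum, alpha) - np.minimum(prev, alpha)
    return float(w @ c) / alpha

costs = [1.0, 2.0, 10.0]                   # hypothetical driving costs; 10 = collision
probs = [0.5, 0.4, 0.1]
print(cvar(costs, probs, 1.0))             # 2.3  (risk-neutral mean)
print(cvar(costs, probs, 0.1))             # 10.0 (worst-case tail)
```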
Pages: 10
Related papers
50 records in total
  • [1] Inverse Risk-Sensitive Reinforcement Learning
    Ratliff, Lillian J.
    Mazumdar, Eric
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2020, 65 (03) : 1256 - 1263
  • [2] Risk-Sensitive Reinforcement Learning
    Shen, Yun
    Tobia, Michael J.
    Sommer, Tobias
    Obermayer, Klaus
    NEURAL COMPUTATION, 2014, 26 (07) : 1298 - 1328
  • [3] Risk-sensitive reinforcement learning
    Mihatsch, O
    Neuneier, R
    MACHINE LEARNING, 2002, 49 (2-3) : 267 - 290
  • [4] Gradient-Based Inverse Risk-Sensitive Reinforcement Learning
    Mazumdar, Eric
    Ratliff, Lillian J.
    Fiez, Tanner
    Sastry, S. Shankar
    2017 IEEE 56TH ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2017
  • [5] Risk-Sensitive Reinforcement Learning via Policy Gradient Search
    Prashanth, L. A.
    Fu, Michael C.
    FOUNDATIONS AND TRENDS IN MACHINE LEARNING, 2022, 15 (05) : 537 - 693
  • [6] Risk-sensitive inverse reinforcement learning via semi- and non-parametric methods
    Singh, Sumeet
    Lacotte, Jonathan
    Majumdar, Anirudha
    Pavone, Marco
    INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2018, 37 (13-14) : 1713 - 1740
  • [7] Risk-Sensitive Policy with Distributional Reinforcement Learning
    Theate, Thibaut
    Ernst, Damien
    ALGORITHMS, 2023, 16 (07)
  • [8] A Probabilistic Perspective on Risk-sensitive Reinforcement Learning
    Noorani, Erfaun
    Baras, John S.
    2022 AMERICAN CONTROL CONFERENCE, ACC, 2022, : 2697 - 2702
  • [9] Regret Bounds for Risk-Sensitive Reinforcement Learning
    Bastani, Osbert
    Ma, Yecheng Jason
    Shen, Estelle
    Xu, Wanqiao
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022