Risk-sensitive Inverse Reinforcement Learning via Coherent Risk Models

Times cited: 0
Authors
Majumdar, Anirudha [1 ]
Singh, Sumeet [1 ]
Mandlekar, Ajay [2 ]
Pavone, Marco [1 ]
Affiliations
[1] Stanford Univ, Dept Aeronaut & Astronaut, Stanford, CA 94305 USA
[2] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA
Source
ROBOTICS: SCIENCE AND SYSTEMS XIII, 2017
Keywords
MARKOV DECISION-PROCESSES; EXPECTED-UTILITY
DOI
Not available
Chinese Library Classification
TP24 [Robotics]
Discipline codes
080202; 1405
Abstract
The literature on Inverse Reinforcement Learning (IRL) typically assumes that humans take actions in order to minimize the expected value of a cost function, i.e., that humans are risk neutral. Yet, in practice, humans are often far from being risk neutral. To fill this gap, the objective of this paper is to devise a framework for risk-sensitive IRL in order to explicitly account for an expert's risk sensitivity. To this end, we propose a flexible class of models based on coherent risk metrics, which allow us to capture an entire spectrum of risk preferences from risk-neutral to worst-case. We propose efficient algorithms based on Linear Programming for inferring an expert's underlying risk metric and cost function for a rich class of static and dynamic decision-making settings. The resulting approach is demonstrated on a simulated driving game with ten human participants. Our method is able to infer and mimic a wide range of qualitatively different driving styles from highly risk-averse to risk-neutral in a data-efficient manner. Moreover, comparisons of the Risk-Sensitive (RS) IRL approach with a risk-neutral model show that the RS-IRL framework more accurately captures observed participant behavior both qualitatively and quantitatively.
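For intuition, the sketch below (not from the paper's record above) illustrates one familiar member of the coherent risk metric class, Conditional Value-at-Risk (CVaR), whose tail level alpha sweeps exactly the spectrum the abstract describes: the risk-neutral expectation at alpha = 1, approaching the worst-case cost as alpha approaches 0. The function and example numbers are illustrative assumptions; this is not the paper's LP-based inference procedure.

```python
import numpy as np

def cvar(costs, probs, alpha):
    """CVaR_alpha of a discrete cost distribution (higher cost = worse).

    Illustrative sketch only: CVaR is a single member of the coherent
    risk metric class discussed in the abstract.
    alpha = 1 recovers the risk-neutral expectation E[Z];
    alpha -> 0 approaches the worst-case (maximum) cost.
    """
    costs = np.asarray(costs, dtype=float)
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(costs)[::-1]        # sort outcomes from worst to best
    c, p = costs[order], probs[order]
    cum = np.cumsum(p)                     # cumulative probability mass
    prev = np.concatenate(([0.0], cum[:-1]))
    # each outcome contributes only the part of its mass inside the alpha-tail
    w = np.minimum(cum, alpha) - np.minimum(prev, alpha)
    return float(w @ c) / alpha

costs = [1.0, 2.0, 10.0]                   # hypothetical driving costs; 10 = collision
probs = [0.5, 0.4, 0.1]
print(cvar(costs, probs, 1.0))             # 2.3  (risk-neutral mean)
print(cvar(costs, probs, 0.1))             # 10.0 (worst-case tail)
```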
Pages: 10
Related papers
50 records in total
  • [1] Inverse Risk-Sensitive Reinforcement Learning
    Ratliff, Lillian J.
    Mazumdar, Eric
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2020, 65 (03) : 1256 - 1263
  • [2] Risk-Sensitive Reinforcement Learning
    Shen, Yun
    Tobia, Michael J.
    Sommer, Tobias
    Obermayer, Klaus
    NEURAL COMPUTATION, 2014, 26 (07) : 1298 - 1328
  • [3] Risk-sensitive reinforcement learning
    Mihatsch, O
    Neuneier, R
    MACHINE LEARNING, 2002, 49 (2-3) : 267 - 290
  • [4] Gradient-Based Inverse Risk-Sensitive Reinforcement Learning
    Mazumdar, Eric
    Ratliff, Lillian J.
    Fiez, Tanner
    Sastry, S. Shankar
    2017 IEEE 56TH ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2017
  • [5] Risk-Sensitive Reinforcement Learning via Policy Gradient Search
    Prashanth, L. A.
    Fu, Michael C.
    FOUNDATIONS AND TRENDS IN MACHINE LEARNING, 2022, 15 (05) : 537 - 693
  • [6] Risk-sensitive inverse reinforcement learning via semi- and non-parametric methods
    Singh, Sumeet
    Lacotte, Jonathan
    Majumdar, Anirudha
    Pavone, Marco
    INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2018, 37 (13-14) : 1713 - 1740
  • [7] Risk-Sensitive Policy with Distributional Reinforcement Learning
    Theate, Thibaut
    Ernst, Damien
    ALGORITHMS, 2023, 16 (07)
  • [8] A Probabilistic Perspective on Risk-sensitive Reinforcement Learning
    Noorani, Erfaun
    Baras, John S.
    2022 AMERICAN CONTROL CONFERENCE, ACC, 2022, : 2697 - 2702
  • [9] Regret Bounds for Risk-Sensitive Reinforcement Learning
    Bastani, Osbert
    Ma, Yecheng Jason
    Shen, Estelle
    Xu, Wanqiao
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022