RAIL: Risk-Averse Imitation Learning (Extended Abstract)

Cited by: 0
Authors
Santara, Anirban [1 ,2 ]
Naik, Abhishek [1 ,3 ,4 ]
Ravindran, Balaraman [3 ,4 ]
Das, Dipankar [5 ]
Mudigere, Dheevatsa [5 ]
Avancha, Sasikanth [5 ]
Kaul, Bharat [5 ]
Affiliations
[1] Intel Labs, Bangalore, Karnataka, India
[2] Indian Inst Technol Kharagpur, Kharagpur, W Bengal, India
[3] Indian Inst Technol Madras, Dept CSE, Madras, Tamil Nadu, India
[4] Indian Inst Technol Madras, Robert Bosch Ctr Data Sci & AI, Madras, Tamil Nadu, India
[5] Intel Labs, Parallel Comp Lab, Bangalore, Karnataka, India
Keywords
Reinforcement Learning; Imitation Learning; Risk Minimization; Conditional-Value-at-Risk; Reliability
DOI
Not available
Chinese Library Classification (CLC)
TP [Automation and Computer Technology]
Discipline Code
0812
Abstract
Imitation learning algorithms learn viable policies by imitating an expert's behavior when reward signals are not available. Generative Adversarial Imitation Learning (GAIL) is a state-of-the-art algorithm for learning policies when the expert's behavior is available as a fixed set of trajectories. We evaluate the learned policies in terms of the expert's cost function and observe that the distribution of trajectory costs is often more heavy-tailed for GAIL agents than for the expert on a number of benchmark continuous-control tasks. Thus, high-cost trajectories, corresponding to tail-end events of catastrophic failure, are more likely to be encountered by GAIL agents than by the expert. This makes the reliability of GAIL agents questionable when it comes to deployment in risk-sensitive applications like robotic surgery and autonomous driving. In this work, we aim to minimize the occurrence of tail-end events by minimizing tail risk within the GAIL framework. We quantify tail risk by the Conditional-Value-at-Risk (CVaR) of trajectories and develop the Risk-Averse Imitation Learning (RAIL) algorithm. We observe that policies learned with RAIL show lower tail risk than those learned with vanilla GAIL. The proposed RAIL algorithm thus appears to be a potent alternative to GAIL for improved reliability in risk-sensitive applications.
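The tail-risk measure the abstract refers to, the Conditional-Value-at-Risk of trajectory costs, can be estimated empirically as the mean cost over the worst (1 - alpha) fraction of trajectories. Below is a minimal sketch of that estimator; the function name and the synthetic expert/agent cost data are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def empirical_cvar(costs, alpha=0.9):
    """Empirical Conditional-Value-at-Risk (CVaR) of a set of trajectory costs.

    VaR_alpha is the alpha-quantile of the cost distribution; CVaR_alpha is
    the expected cost over the worst (1 - alpha) fraction of trajectories.
    """
    costs = np.asarray(costs, dtype=float)
    var = np.quantile(costs, alpha)   # Value-at-Risk: the alpha-quantile
    tail = costs[costs >= var]        # tail-end (highest-cost) trajectories
    return tail.mean()

# Illustration: an agent whose cost distribution has a heavy tail of rare
# failures shows a much higher CVaR than an expert with a similar mean cost.
rng = np.random.default_rng(0)
expert_costs = rng.normal(1.0, 0.1, size=1000)
agent_costs = np.concatenate([
    rng.normal(1.0, 0.1, size=950),   # nominal behavior
    rng.normal(3.0, 0.5, size=50),    # rare catastrophic trajectories
])
```

Two distributions can have nearly identical mean cost yet very different CVaR, which is why the abstract argues mean performance alone is insufficient for risk-sensitive deployment.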
Pages: 2062-2063 (2 pages)