Distributionally Robust Imitation Learning

Cited by: 0
Authors
Bashiri, Mohammad Ali [1 ]
Ziebart, Brian D. [1 ]
Zhang, Xinhua [1 ]
Affiliations
[1] Univ Illinois, Dept Comp Sci, Chicago, IL 60607 USA
Funding
National Science Foundation (USA)
Keywords
Optimization
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We consider the imitation learning problem of learning a policy in a Markov Decision Process (MDP) setting where the reward function is not given, but demonstrations from experts are available. Although the goal of imitation learning is to learn a policy that produces behaviors nearly as good as the experts' for a desired task, assumptions of consistent optimality for demonstrated behaviors are often violated in practice. Finding a policy that is distributionally robust against noisy demonstrations, based on an adversarial construction, potentially solves this problem by avoiding optimistic generalizations of the demonstrated data. This paper studies Distributionally Robust Imitation Learning (DROIL) and establishes a close connection between DROIL and Maximum Entropy Inverse Reinforcement Learning. We show that DROIL can be seen as a framework that maximizes a generalized concept of entropy. We develop a novel approach to transform the objective function into a convex optimization problem over a polynomial number of variables for a class of loss functions that are additive over state and action spaces. Our approach lets us optimize both stationary and non-stationary policies and, unlike prevalent previous methods, it does not require repeatedly solving an inner reinforcement learning problem. We experimentally show the significant benefits of DROIL's new optimization method on synthetic data and a highway driving environment.
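As a rough illustration only (not taken from the paper itself; the loss \ell, feature map \phi, and demonstration feature expectations \tilde{\phi} are assumed notation), the adversarial construction described in the abstract can be sketched as a minimax problem in which the learner's policy \hat{\pi} is evaluated against a worst-case policy \check{\pi} that is constrained to match the demonstrated feature expectations:

  \hat{\pi}^{*} \;=\; \arg\min_{\hat{\pi}} \; \max_{\check{\pi}\,:\,\mathbb{E}_{\check{\pi}}[\phi(s,a)] = \tilde{\phi}} \; \mathbb{E}_{\hat{\pi},\,\check{\pi}}\bigl[\ell(\hat{\pi},\check{\pi})\bigr]

Under a logarithmic loss, the inner maximization is presumably solved by the maximum-entropy distribution consistent with the feature-matching constraint, which would account for the stated connection to Maximum Entropy Inverse Reinforcement Learning; for losses that are additive over states and actions, the abstract indicates that the whole problem can be reformulated as a convex program over a polynomial number of variables (plausibly occupancy-measure-like quantities), avoiding a repeated inner reinforcement learning step.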
Pages: 14
Related Papers (50 in total)
  • [1] Distributionally Robust Behavioral Cloning for Robust Imitation Learning. Panaganti, Kishan; Xu, Zaiyan; Kalathil, Dileep; Ghavamzadeh, Mohammad. 2023 62nd IEEE Conference on Decision and Control (CDC), 2023: 1342-1347.
  • [2] Distributionally Robust Q-Learning. Liu, Zijian; Bai, Qinxun; Blanchet, Jose; Dong, Perry; Xu, Wei; Zhou, Zhengqing; Zhou, Zhengyuan. International Conference on Machine Learning, Vol. 162, 2022.
  • [3] Efficient Generalization with Distributionally Robust Learning. Ghosh, Soumyadip; Squillante, Mark S.; Wollega, Ebisa D. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021.
  • [4] Does Distributionally Robust Supervised Learning Give Robust Classifiers? Hu, Weihua; Niu, Gang; Sato, Issei; Sugiyama, Masashi. International Conference on Machine Learning, Vol. 80, 2018.
  • [5] Distributionally Robust Learning With Stable Adversarial Training. Liu, Jiashuo; Shen, Zheyan; Cui, Peng; Zhou, Linjun; Kuang, Kun; Li, Bo. IEEE Transactions on Knowledge and Data Engineering, 2023, 35(11): 11288-11300.
  • [6] A Robust Learning Approach for Regression Models Based on Distributionally Robust Optimization. Chen, Ruidi; Paschalidis, Ioannis Ch. Journal of Machine Learning Research, 2018, 19.
  • [7] Doubly Robust Distributionally Robust Off-Policy Evaluation and Learning. Kallus, Nathan; Mao, Xiaojie; Wang, Kaiwen; Zhou, Zhengyuan. International Conference on Machine Learning, Vol. 162, 2022: 10598-10632.
  • [8] Distributionally Robust Edge Learning with Dirichlet Process Prior. Zhang, Zhaofeng; Chen, Yue; Zhang, Junshan. 2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS), 2020: 798-808.
  • [9] Distributionally Robust Skeleton Learning of Discrete Bayesian Networks. Li, Yeshu; Ziebart, Brian D. Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023.
  • [10] Distributionally Robust Federated Learning for Mobile Edge Networks. Le, Long Tan; Nguyen, Tung-Anh; Nguyen, Tuan-Dung; Tran, Nguyen H.; Truong, Nguyen Binh; Vo, Phuong L.; Hung, Bui Thanh; Le, Tuan Anh. Mobile Networks & Applications, 2024, 29(1): 262-272.