Provably Efficient Imitation Learning from Observation Alone

Cited by: 0

Authors:

Sun, Wen [1]

Vemula, Anirudh [1]

Boots, Byron [2]

Bagnell, J. Andrew [3]

Affiliations:

[1] Carnegie Mellon Univ, Robot Inst, Pittsburgh, PA 15213 USA

[2] Georgia Inst Technol, Coll Comp, Atlanta, GA 30332 USA

[3] Aurora Innovat, Pittsburgh, PA USA

Source:

INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97 | 2019 / Volume 97

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We study Imitation Learning (IL) from Observations alone (ILFO) in large-scale MDPs. While most IL algorithms rely on an expert to directly provide actions to the learner, in this setting the expert only supplies sequences of observations. We design a new model-free algorithm for ILFO, Forward Adversarial Imitation Learning (FAIL), which learns a sequence of time-dependent policies by minimizing an Integral Probability Metric between the observation distributions of the expert policy and the learner. FAIL is the first provably efficient algorithm in the ILFO setting: it learns a near-optimal policy with a number of samples that is polynomial in all relevant parameters but independent of the number of unique observations. The resulting theory extends the domain of provably sample-efficient learning algorithms beyond existing results, which typically only consider tabular reinforcement learning settings or settings that require access to a near-optimal reset distribution. We also demonstrate the efficacy of FAIL on multiple OpenAI Gym control tasks.
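
As a rough illustration of the quantity named in the abstract, an Integral Probability Metric over a function class F is the largest gap in expected value, sup over f in F of |E_P[f] - E_Q[f]|, between two distributions P and Q. The sketch below estimates it from samples for a small finite class. It is not the paper's implementation: the function name `empirical_ipm`, the toy 1-D Gaussian observations, and the three bounded discriminators are all assumptions made for this example.

```python
import numpy as np

def empirical_ipm(expert_obs, learner_obs, discriminators):
    """Sample estimate of an IPM over a finite class of functions.

    Returns max over f of |mean f(expert_obs) - mean f(learner_obs)|.
    """
    gaps = [abs(np.mean(f(expert_obs)) - np.mean(f(learner_obs)))
            for f in discriminators]
    return max(gaps)

# Hypothetical toy data: 1-D observations at a single time step.
rng = np.random.default_rng(0)
expert_obs = rng.normal(loc=1.0, scale=1.0, size=500)
learner_obs = rng.normal(loc=0.0, scale=1.0, size=500)

# Hypothetical discriminator class; every function is bounded in [-1, 1].
discriminators = [np.tanh, np.sin, lambda x: np.clip(x, -1.0, 1.0)]

gap = empirical_ipm(expert_obs, learner_obs, discriminators)   # mismatched distributions
same = empirical_ipm(expert_obs, expert_obs, discriminators)   # identical samples
print(gap, same)
```

With identical samples the estimate is exactly zero, and because each discriminator is bounded in [-1, 1] the estimate can never exceed 2. A learner that matches the expert's observation distribution would drive this gap toward zero.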


Pages: 10
