Masked and Inverse Dynamics Modeling for Data-Efficient Reinforcement Learning

Cited by: 0
Authors
Lee, Young Jae [1 ]
Kim, Jaehoon [1 ]
Park, Young Joon [2 ]
Kwak, Mingu [3 ]
Kim, Seoung Bum [1 ]
Affiliations
[1] Korea Univ, Dept Ind & Management Engn, Seoul 02841, South Korea
[2] LG AI Res, Seoul 07796, South Korea
[3] Georgia Inst Technol, Sch Ind & Syst Engn, Atlanta, GA 30332 USA
Funding
National Research Foundation, Singapore
Keywords
Data models; Data augmentation; Transformers; Task analysis; Representation learning; Predictive models; Inverse problems; Data-efficient reinforcement learning; inverse dynamics modeling; masked modeling; self-supervised multitask learning; transformer;
DOI
10.1109/TNNLS.2024.3439261
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In pixel-based deep reinforcement learning (DRL), learning representations of states that change because of an agent's action or interaction with the environment poses a critical challenge in improving data efficiency. Recent data-efficient DRL studies have integrated DRL with self-supervised learning (SSL) and data augmentation to learn state representations from given interactions. However, some methods struggle to explicitly capture evolving state representations or to select data augmentations that yield appropriate reward signals. Our goal is to explicitly learn the inherent dynamics that change with an agent's intervention and interaction with the environment. We propose masked and inverse dynamics modeling (MIND), which uses masking augmentation and fewer hyperparameters to learn agent-controllable representations in changing states. Our method consists of self-supervised multitask learning that leverages a transformer architecture, which captures the spatiotemporal information underlying the highly correlated consecutive frames. MIND performs self-supervised multitask learning through two tasks: masked modeling and inverse dynamics modeling. Masked modeling learns the static visual representations required for control, while inverse dynamics modeling learns the representations of states that evolve rapidly under the agent's intervention. By integrating inverse dynamics modeling as a complementary component to masked modeling, our method effectively learns evolving state representations. We evaluate our method on discrete and continuous control environments with limited interactions. MIND outperforms previous methods across benchmarks and significantly improves data efficiency. The code is available at https://github.com/dudwojae/MIND.
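To make the two objectives concrete, the following is a minimal PyTorch sketch of how masked modeling and inverse dynamics modeling could be combined into a single self-supervised multitask loss over a transformer. It is our illustration under stated assumptions (pre-encoded frame embeddings, discrete actions, a learned mask token, and hypothetical names such as MINDSketch), not the authors' implementation; the actual code is at https://github.com/dudwojae/MIND.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MINDSketch(nn.Module):
    """Joint masked-modeling + inverse-dynamics heads over frame embeddings.

    Assumptions (ours, not from the paper): frames are already encoded to
    `embed_dim` vectors, actions are discrete, and a learned mask token
    replaces masked positions before the transformer.
    """

    def __init__(self, embed_dim=128, num_actions=18, num_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers)
        self.mask_token = nn.Parameter(torch.zeros(embed_dim))
        # Masked-modeling head: reconstruct the original frame embedding.
        self.recon_head = nn.Linear(embed_dim, embed_dim)
        # Inverse-dynamics head: predict a_t from (s_t, s_{t+1}).
        self.inv_head = nn.Linear(2 * embed_dim, num_actions)

    def forward(self, frames, actions, mask_ratio=0.5):
        # frames: (B, T, D) consecutive frame embeddings; actions: (B, T-1).
        B, T, D = frames.shape
        mask = torch.rand(B, T, device=frames.device) < mask_ratio
        masked = torch.where(mask.unsqueeze(-1), self.mask_token, frames)
        z = self.transformer(masked)                       # (B, T, D)
        # Masked modeling: reconstruct only the masked positions.
        recon_loss = F.mse_loss(self.recon_head(z)[mask], frames[mask])
        # Inverse dynamics: classify the action between adjacent states.
        pairs = torch.cat([z[:, :-1], z[:, 1:]], dim=-1)   # (B, T-1, 2D)
        inv_loss = F.cross_entropy(
            self.inv_head(pairs).flatten(0, 1), actions.flatten())
        return recon_loss + inv_loss

# Toy usage: batch of 8 trajectories, 6 frames each, 18 discrete actions.
model = MINDSketch()
frames = torch.randn(8, 6, 128)          # pre-encoded frame embeddings
actions = torch.randint(0, 18, (8, 5))   # one action per frame transition
loss = model(frames, actions)            # joint self-supervised loss

In training, such an auxiliary loss would be minimized jointly with the RL objective, so that the shared transformer representation captures both static appearance (via reconstruction) and agent-controllable dynamics (via action prediction).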
Pages: 14