Sample Efficient Reinforcement Learning Using Graph-Based Memory Reconstruction

Cited by: 0
Authors
Kang Y. [1 ,2 ]
Zhao E. [1 ,2 ]
Zang Y. [1 ,2 ]
Li L. [2 ]
Li K. [2 ]
Tao P. [3 ]
Xing J. [3 ]
Affiliations
[1] School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing
[2] Institute of Automation, Chinese Academy of Sciences, Beijing
[3] Department of Computer Science and Technology, Tsinghua University, Beijing
Funding
National Natural Science Foundation of China;
Keywords
Experience replay (ER); graph model; memory reconstruction; reinforcement learning (RL); sample efficiency;
DOI
10.1109/TAI.2023.3268612
Abstract
Reinforcement learning (RL) algorithms typically require orders of magnitude more interactions than humans to learn effective policies. Research on memory in neuroscience suggests that humans' learning efficiency benefits from associating their experiences and reconstructing potential events. Inspired by this finding, we introduce a human brain-like memory structure for agents and build a general learning framework based on this structure to improve RL sample efficiency. Since this framework is similar to the memory reconstruction process in psychology, we name the newly proposed RL framework graph-based memory reconstruction (GBMR). In particular, GBMR first maintains an attribute graph on the agent's memory and then retrieves its critical nodes to build and update potential paths among these nodes. This novel pipeline drives the RL agent to learn faster with its memory-enhanced value functions and reduces interactions with the environment by reconstructing its valuable paths. Extensive experimental analyses and evaluations in the grid maze and some challenging Atari environments demonstrate GBMR's superiority over traditional RL methods. We will release the source code and trained models to facilitate further studies in this research direction. © 2023 IEEE.
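The abstract outlines the GBMR pipeline: a memory graph over experiences, retrieval of critical nodes, and value backups along reconstructed paths. The Python sketch below is only an illustrative toy of that idea under our own assumptions (a tabular setting with hypothetical names such as GraphMemory, add_transition, critical_nodes, and reconstruct_values); it is not the authors' released implementation.

# Minimal, hypothetical sketch of a graph-based episodic memory: transitions are
# stored as directed edges over observed states, high-value ("critical") nodes can
# be retrieved, and returns are backed up along stored paths to produce
# memory-enhanced value estimates without new environment interaction.
from collections import defaultdict


class GraphMemory:
    def __init__(self, gamma=0.99):
        self.gamma = gamma
        # adjacency: state -> list of (action, reward, next_state)
        self.edges = defaultdict(list)
        # memory-based value estimate per state
        self.values = defaultdict(float)

    def add_transition(self, state, action, reward, next_state):
        """Store one interaction as a directed edge of the memory graph."""
        self.edges[state].append((action, reward, next_state))

    def critical_nodes(self, top_k=10):
        """Retrieve the states with the highest stored value estimates."""
        return sorted(self.values, key=self.values.get, reverse=True)[:top_k]

    def reconstruct_values(self, sweeps=5):
        """Propagate rewards backwards along stored edges (value-iteration style),
        loosely mimicking the path-reconstruction step described in the abstract."""
        for _ in range(sweeps):
            for state, transitions in self.edges.items():
                backups = [r + self.gamma * self.values[ns] for _, r, ns in transitions]
                if backups:
                    self.values[state] = max(backups)
        return dict(self.values)


if __name__ == "__main__":
    # Toy 3-state chain: s0 -> s1 -> s2, with reward 1 on the final step.
    mem = GraphMemory(gamma=0.9)
    mem.add_transition("s0", "right", 0.0, "s1")
    mem.add_transition("s1", "right", 1.0, "s2")
    print(mem.reconstruct_values())      # e.g. {'s0': 0.9, 's1': 1.0, 's2': 0.0}
    print(mem.critical_nodes(top_k=2))   # e.g. ['s1', 's0']

In this toy, reconstruct_values plays the role of the memory-enhanced value function: returns are swept backwards through the stored graph, so the agent can refine its estimates from memory alone rather than from additional environment steps.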
Pages: 751-762
Page count: 11