Contrastive Learning Methods for Deep Reinforcement Learning

Cited by: 2
Authors
Wang, Di [1 ]
Hu, Mengqi [1 ]
Affiliations
[1] Univ Illinois, Dept Mech & Ind Engn, Chicago, IL 60609 USA
Funding
U.S. National Science Foundation;
Keywords
Contrastive learning; deep reinforcement learning; different-age experience; experience replay buffer; parallel learning; BUFFER;
DOI
10.1109/ACCESS.2023.3312383
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Deep reinforcement learning (DRL) has shown promising performance in various application areas (e.g., games and autonomous vehicles). The experience replay buffer strategy and the parallel learning strategy are widely used to boost the performance of offline and online deep reinforcement learning algorithms. However, state-action distribution shifts lead to bootstrapping errors. An experience replay buffer learns policies from older experience trajectories, which limits its use to off-policy algorithms, and balancing new and old experiences is challenging. Parallel learning strategies can train policies with online experiences, but parallel environment instances organize the agent pool inefficiently and incur higher simulation or physical costs. To overcome these shortcomings, we develop four lightweight and effective DRL algorithms, the instance-actor, parallel-actor, instance-critic, and parallel-critic methods, which contrast trajectory experiences of different ages. We train the contrastive DRL agents using the received rewards and a proposed contrast loss computed from designed positive/negative keys. Our benchmark experiments in PyBullet robotics environments show that the proposed algorithms match or outperform state-of-the-art DRL algorithms.
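The abstract only sketches the training objective, so the following minimal Python/PyTorch sketch illustrates one plausible reading of it: an InfoNCE-style contrast loss over positive/negative keys built from experiences of different ages, added to a standard TD critic loss. The names encoder, q_net, contrast_weight, and the choice of positives/negatives are illustrative assumptions, not the authors' actual implementation.

# Minimal sketch (not the authors' exact method): combine a TD critic loss
# with an InfoNCE-style contrast loss over positive/negative keys.
# encoder, q_net, batch layout, and contrast_weight are assumed for illustration.
import torch
import torch.nn.functional as F

def info_nce(query, positive_key, negative_keys, temperature=0.1):
    # Pull each query toward its positive key, push it away from negative keys.
    query = F.normalize(query, dim=-1)
    positive_key = F.normalize(positive_key, dim=-1)
    negative_keys = F.normalize(negative_keys, dim=-1)
    pos_logits = (query * positive_key).sum(dim=-1, keepdim=True)  # (batch, 1)
    neg_logits = query @ negative_keys.t()                         # (batch, n_neg)
    logits = torch.cat([pos_logits, neg_logits], dim=1) / temperature
    labels = torch.zeros(query.size(0), dtype=torch.long, device=query.device)
    return F.cross_entropy(logits, labels)

def critic_loss_with_contrast(encoder, q_net, batch_new, batch_old,
                              td_target, contrast_weight=0.1):
    # Hypothetical combined objective: TD error on recent experience plus a
    # contrast term with recent samples as queries, an augmented view of the
    # same samples as positive keys, and older replay samples as negative keys.
    z_new = encoder(batch_new["obs"])
    z_pos = encoder(batch_new["obs_aug"])
    z_neg = encoder(batch_old["obs"])
    td_loss = F.mse_loss(q_net(z_new, batch_new["act"]), td_target)
    return td_loss + contrast_weight * info_nce(z_new, z_pos, z_neg)

How the instance/parallel actor and critic variants actually construct the different-age keys is described in the full text; this sketch only shows how such a contrast term can be weighted against the reward-driven loss.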
Pages: 97107-97117
Number of pages: 11
Related papers
50 records in total
  • [1] Semisupervised Learning for Noise Suppression Using Deep Reinforcement Learning of Contrastive Features
    Kazemi, Ehsan
    Taherkhani, Fariborz
    Wang, Liqiang
    [J]. IEEE SENSORS LETTERS, 2023, 7 (04)
  • [2] Generalized Representation Learning Methods for Deep Reinforcement Learning
    Zhu, Hanhua
    [J]. PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 5216 - 5217
  • [3] Asynchronous Methods for Deep Reinforcement Learning
    Mnih, Volodymyr
    Badia, Adria Puigdomenech
    Mirza, Mehdi
    Graves, Alex
    Harley, Tim
    Lillicrap, Timothy P.
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [4] Masked Contrastive Representation Learning for Reinforcement Learning
    Zhu, Jinhua
    Xia, Yingce
    Wu, Lijun
    Deng, Jiajun
    Zhou, Wengang
    Qin, Tao
    Liu, Tie-Yan
    Li, Houqiang
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (03) : 3421 - 3433
  • [5] Discrete-to-deep reinforcement learning methods
    Kurniawan, Budi
    Vamplew, Peter
    Papasimeon, Michael
    Dazeley, Richard
    Foale, Cameron
    [J]. NEURAL COMPUTING & APPLICATIONS, 2022, 34 (03) : 1713 - 1733
  • [6] Decomposition methods with deep corrections for reinforcement learning
    Bouton, Maxime
    Julian, Kyle D.
    Nakhaei, Alireza
    Fujimura, Kikuo
    Kochenderfer, Mykel J.
    [J]. AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2019, 33 (03) : 330 - 352
  • [7] Sequential and Dynamic constraint Contrastive Learning for Reinforcement Learning
    Shen, Weijie
    Yuan, Lei
    Huang, Junfu
    Gao, Songyi
    Huang, Yuyang
    Yu, Yang
    [J]. 2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [8] A survey on deep learning and deep reinforcement learning in robotics with a tutorial on deep reinforcement learning
    Morales, Eduardo F.
    Murrieta-Cid, Rafael
    Becerra, Israel
    Esquivel-Basaldua, Marco A.
    [J]. INTELLIGENT SERVICE ROBOTICS, 2021, 14 (05) : 773 - 805