Contrastive Learning Methods for Deep Reinforcement Learning

Cited by: 2
Authors
Wang, Di [1 ]
Hu, Mengqi [1 ]
Affiliations
[1] Univ Illinois, Dept Mech & Ind Engn, Chicago, IL 60609 USA
Funding
U.S. National Science Foundation;
Keywords
Contrastive learning; deep reinforcement learning; different-age experience; experience replay buffer; parallel learning; BUFFER;
DOI
10.1109/ACCESS.2023.3312383
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Deep reinforcement learning (DRL) has shown promising performance in various application areas (e.g., games and autonomous vehicles). The experience replay buffer strategy and the parallel learning strategy are widely used to boost the performance of offline and online deep reinforcement learning algorithms. However, state-action distribution shifts lead to bootstrapping errors. An experience replay buffer learns policies from older experience trajectories, which limits its use to off-policy algorithms, and balancing new and old experiences is challenging. Parallel learning strategies can train policies with online experiences, but parallel environment instances organize the agent pool inefficiently and incur higher simulation or physical costs. To overcome these shortcomings, we develop four lightweight and effective DRL algorithms, the instance-actor, parallel-actor, instance-critic, and parallel-critic methods, which contrast trajectory experiences of different ages. We train the contrastive DRL agents using the received rewards and a proposed contrast loss computed from designed positive/negative keys. Our benchmark experiments in PyBullet robotics environments show that the proposed algorithms match or outperform state-of-the-art DRL algorithms.
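The abstract only sketches the training objective, so the following minimal Python/PyTorch sketch illustrates one plausible reading of it: an InfoNCE-style contrast loss over positive/negative keys built from experiences of different ages, added to a standard TD critic loss. The names encoder, q_net, contrast_weight, and the choice of positives/negatives are illustrative assumptions, not the authors' actual implementation.

# Minimal sketch (not the authors' exact method): combine a TD critic loss
# with an InfoNCE-style contrast loss over positive/negative keys.
# encoder, q_net, batch layout, and contrast_weight are assumed for illustration.
import torch
import torch.nn.functional as F

def info_nce(query, positive_key, negative_keys, temperature=0.1):
    # Pull each query toward its positive key, push it away from negative keys.
    query = F.normalize(query, dim=-1)
    positive_key = F.normalize(positive_key, dim=-1)
    negative_keys = F.normalize(negative_keys, dim=-1)
    pos_logits = (query * positive_key).sum(dim=-1, keepdim=True)  # (batch, 1)
    neg_logits = query @ negative_keys.t()                         # (batch, n_neg)
    logits = torch.cat([pos_logits, neg_logits], dim=1) / temperature
    labels = torch.zeros(query.size(0), dtype=torch.long, device=query.device)
    return F.cross_entropy(logits, labels)

def critic_loss_with_contrast(encoder, q_net, batch_new, batch_old,
                              td_target, contrast_weight=0.1):
    # Hypothetical combined objective: TD error on recent experience plus a
    # contrast term with recent samples as queries, an augmented view of the
    # same samples as positive keys, and older replay samples as negative keys.
    z_new = encoder(batch_new["obs"])
    z_pos = encoder(batch_new["obs_aug"])
    z_neg = encoder(batch_old["obs"])
    td_loss = F.mse_loss(q_net(z_new, batch_new["act"]), td_target)
    return td_loss + contrast_weight * info_nce(z_new, z_pos, z_neg)

How the instance/parallel actor and critic variants actually construct the different-age keys is described in the full text; this sketch only shows how such a contrast term can be weighted against the reward-driven loss.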
Pages: 97107-97117
Number of pages: 11
Related papers
50 records in total
  • [1] Semisupervised Learning for Noise Suppression Using Deep Reinforcement Learning of Contrastive Features
    Kazemi, Ehsan
    Taherkhani, Fariborz
    Wang, Liqiang
    [J]. IEEE SENSORS LETTERS, 2023, 7 (04)
  • [2] Generalized Representation Learning Methods for Deep Reinforcement Learning
    Zhu, Hanhua
    [J]. PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 5216 - 5217
  • [3] Asynchronous Methods for Deep Reinforcement Learning
    Mnih, Volodymyr
    Badia, Adria Puigdomenech
    Mirza, Mehdi
    Graves, Alex
    Harley, Tim
    Lillicrap, Timothy P.
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [4] Masked Contrastive Representation Learning for Reinforcement Learning
    Zhu, Jinhua
    Xia, Yingce
    Wu, Lijun
    Deng, Jiajun
    Zhou, Wengang
    Qin, Tao
    Liu, Tie-Yan
    Li, Houqiang
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (03) : 3421 - 3433
  • [5] Discrete-to-deep reinforcement learning methods
    Kurniawan, Budi
    Vamplew, Peter
    Papasimeon, Michael
    Dazeley, Richard
    Foale, Cameron
    [J]. NEURAL COMPUTING & APPLICATIONS, 2022, 34 (03) : 1713 - 1733
  • [6] Decomposition methods with deep corrections for reinforcement learning
    Bouton, Maxime
    Julian, Kyle D.
    Nakhaei, Alireza
    Fujimura, Kikuo
    Kochenderfer, Mykel J.
    [J]. AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2019, 33 (03) : 330 - 352
  • [7] Sequential and Dynamic constraint Contrastive Learning for Reinforcement Learning
    Shen, Weijie
    Yuan, Lei
    Huang, Junfu
    Gao, Songyi
    Huang, Yuyang
    Yu, Yang
    [J]. 2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [8] A survey on deep learning and deep reinforcement learning in robotics with a tutorial on deep reinforcement learning
    Morales, Eduardo F.
    Murrieta-Cid, Rafael
    Becerra, Israel
    Esquivel-Basaldua, Marco A.
    [J]. INTELLIGENT SERVICE ROBOTICS, 2021, 14 (05) : 773 - 805