Learning task-relevant representations via rewards and real actions for reinforcement learning

Cited: 0
Authors
Yuan, Linghui [1 ]
Lu, Xiaowei [1 ]
Liu, Yunlong [1 ]
Affiliations
[1] Xiamen Univ, Dept Automat, Xiamen, Peoples R China
Keywords
Visual reinforcement learning; Task-relevant representations; Representation learning method; Conditional mutual information maximization
DOI
10.1016/j.knosys.2024.111788
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
The input of visual reinforcement learning often contains redundant information, which reduces the agent's decision efficiency and degrades its performance. To address this issue, task-relevant representations of the input are usually learned, in which only task-related information is preserved for decision making. In the literature, such representations are commonly learned via auxiliary tasks constructed from reward signals, from an optimal policy, or by extracting the controllable elements of the input. However, methods based on reward signals do not work well in sparse-reward environments, the effectiveness of methods using an optimal policy relies heavily on how close the given policy is to optimal, and methods that extract controllable elements ignore the uncontrollable yet task-relevant information in the input. To alleviate these problems and learn better task-relevant representations, in this paper we first encourage the encoder to encode the controllable parts of the input by maximizing the conditional mutual information between the representations and the agent's real actions. Then, because reward signals are directly related to the underlying task, they are used to encode more task-related information, irrespective of whether that information is controllable. Finally, a temporal coherence constraint is incorporated into the whole framework to reduce the task-irrelevant information in the representations. Experiments on the Distracting DeepMind Control Suite and the autonomous driving simulator CARLA show that the proposed approach outperforms several state-of-the-art (SOTA) baselines, demonstrating its effectiveness in enhancing the agent's decision efficiency and overall performance. Code is available at https://github.com/DMU-XMU/Learning-Task-relevant-Representations-via-Rewards-and-Real-Actions-forReinforcement-Learning.git.
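The abstract describes three training signals: an action term that maximizes the conditional mutual information between representations and the executed actions, a reward term that pulls in task-relevant information regardless of controllability, and a temporal coherence constraint. The following is a minimal PyTorch sketch of how such a combined objective could be wired together; the module architectures, the inverse-dynamics surrogate for the mutual-information term, and the loss weights are illustrative assumptions, not the authors' implementation (see the linked repository for that).

```python
# Hypothetical sketch of the three auxiliary objectives named in the abstract.
# Assumes flat observations and continuous actions (as in DMC / CARLA control).
import torch
import torch.nn as nn
import torch.nn.functional as F

class RepresentationLearner(nn.Module):
    def __init__(self, obs_dim: int, action_dim: int, latent_dim: int = 50):
        super().__init__()
        self.encoder = nn.Sequential(                      # phi: o_t -> z_t
            nn.Linear(obs_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim))
        self.inverse_model = nn.Sequential(                # q(a_t | z_t, z_{t+1})
            nn.Linear(2 * latent_dim, 256), nn.ReLU(),
            nn.Linear(256, action_dim))
        self.reward_model = nn.Sequential(                 # r_hat(z_t, a_t)
            nn.Linear(latent_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, 1))

    def loss(self, obs, action, reward, next_obs,
             w_act=1.0, w_rew=1.0, w_tc=0.1):
        # obs/next_obs: (batch, obs_dim); action: (batch, action_dim);
        # reward: (batch, 1). Weights w_* are illustrative hyperparameters.
        z, z_next = self.encoder(obs), self.encoder(next_obs)

        # (1) Action term: predicting the executed (real) action from
        #     consecutive latents is a common surrogate for maximizing
        #     I(z_{t+1}; a_t | z_t), encouraging controllable features.
        act_pred = self.inverse_model(torch.cat([z, z_next], dim=-1))
        action_loss = F.mse_loss(act_pred, action)

        # (2) Reward term: reward prediction encodes task-relevant
        #     information even when the agent cannot control it.
        rew_pred = self.reward_model(torch.cat([z, action], dim=-1))
        reward_loss = F.mse_loss(rew_pred, reward)

        # (3) Temporal coherence: consecutive latents should stay close,
        #     suppressing fast-changing task-irrelevant distractors.
        #     The detached target avoids trivially dragging z toward z_next;
        #     terms (1) and (2) keep the representation from collapsing.
        coherence_loss = F.mse_loss(z_next, z.detach())

        return w_act * action_loss + w_rew * reward_loss + w_tc * coherence_loss
```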
Pages: 9
Related papers
50 records in total
  • [31] Finding Task-Relevant Features for Few-Shot Learning by Category Traversal
    Li, Hongyang
    Eigen, David
    Dodge, Samuel
    Zeiler, Matthew
    Wang, Xiaogang
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 1 - 10
  • [32] The motor learning effects of combining an external attentional focus and task-relevant autonomy
    Sadowski, Jerzy
    Chaliburda, Agata
    Markwell, Logan
    Wolosz, Pawel
    Cieslinski, Igor
    Niznikowski, Tomasz
    Mastalerz, Andrzej
    Makaruk, Hubert
    JOURNAL OF SPORT & EXERCISE PSYCHOLOGY, 2024, 46 : S42 - S42
  • [33] Learning Intrinsic Symbolic Rewards in Reinforcement Learning
    Sheikh, Hassam Ullah
    Khadka, Shauharda
    Miret, Santiago
    Majumdar, Somdeb
    Phielipp, Mariano
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [34] Online learning of shaping rewards in reinforcement learning
    Grzes, Marek
    Kudenko, Daniel
    NEURAL NETWORKS, 2010, 23 (04) : 541 - 550
  • [35] Integration of imitation learning using GAIL and reinforcement learning using task-achievement rewards via probabilistic graphical model
    Kinose, Akira
    Taniguchi, Tadahiro
    ADVANCED ROBOTICS, 2020, 34 (16) : 1055 - 1067
  • [36] Reinforcement Learning with Perturbed Rewards
    Wang, Jingkang
    Liu, Yang
    Li, Bo
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 6202 - 6209
  • [37] Contrastive Visual Explanations for Reinforcement Learning via Counterfactual Rewards
    Liu, Xiaowei
    McAreavey, Kevin
    Liu, Weiru
    EXPLAINABLE ARTIFICIAL INTELLIGENCE, XAI 2023, PT II, 2023, 1902 : 72 - 87
  • [38] Neural Keyphrase Generation via Reinforcement Learning with Adaptive Rewards
    Chan, Hou Pong
    Chen, Wang
    Wang, Lu
    King, Irwin
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 2163 - 2174
  • [39] Learning Representations via a Robust Behavioral Metric for Deep Reinforcement Learning
    Chen, Jianda
    Pan, Sinno Jialin
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [40] Reinforcement Learning via Auxiliary Task Distillation
    Harish, Abhinav Narayan
    Heck, Larry
    Hanna, Josiah P.
Kira, Zsolt
    Szot, Andrew
    COMPUTER VISION - ECCV 2024, PT LXXXI, 2025, 15139 : 214 - 230