Learning task-relevant representations via rewards and real actions for reinforcement learning

Cited: 0
Authors
Yuan, Linghui [1 ]
Lu, Xiaowei [1 ]
Liu, Yunlong [1 ]
Affiliations
[1] Department of Automation, Xiamen University, Xiamen, People's Republic of China
Keywords
Visual reinforcement learning; Task-relevant representations; Representation learning method; Conditional mutual information maximization
DOI
10.1016/j.knosys.2024.111788
CLC number
TP18 (Theory of artificial intelligence)
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The input of visual reinforcement learning often contains redundant information, which reduces decision efficiency and degrades the agent's performance. To address this issue, task-relevant representations of the input are usually learned, in which only task-related information is preserved for decision making. In the literature, auxiliary tasks constructed from reward signals, from an optimal policy, or by extracting controllable elements of the input are commonly adopted to learn such representations. However, reward-based methods do not work well in sparse-reward environments, the effectiveness of policy-based methods relies heavily on how close the given policy is to optimal, and methods that extract controllable elements ignore uncontrollable yet task-relevant information in the input. To alleviate these problems and learn better task-relevant representations, in this paper we first encourage the encoder to encode the controllable parts of the input by maximizing the conditional mutual information between the representations and the agent's real actions. Then, since reward signals are directly related to the underlying tasks, they are used to encode additional task-related information, irrespective of whether that information is controllable. Finally, a temporal coherence constraint is incorporated into the whole framework to reduce the task-irrelevant information in the representations. Experiments on the Distracting DeepMind Control Suite and the autonomous driving simulator CARLA show that the proposed approach outperforms several state-of-the-art (SOTA) baselines, demonstrating its effectiveness in enhancing the agent's decision efficiency and overall performance. Code is available at https://github.com/DMU-XMU/Learning-Task-relevant-Representations-via-Rewards-and-Real-Actions-forReinforcement-Learning.git.
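The abstract names three auxiliary objectives: a conditional-mutual-information term tying representations to the agent's real actions, a reward-based term, and a temporal coherence constraint. The sketch below is a minimal NumPy illustration of how three such loss terms could be combined; it is not the authors' implementation (see their repository for that), and all function names, the InfoNCE stand-in for the mutual-information term, and the loss weights are assumptions for illustration only.

```python
import numpy as np

def action_info_nce(pair_emb, action_emb, temperature=0.1):
    """InfoNCE-style contrastive score standing in for the conditional
    mutual-information term: the embedding of each (z_t, z_{t+1}) pair
    should identify the embedding of the action actually taken, against
    the other actions in the batch."""
    p = pair_emb / np.linalg.norm(pair_emb, axis=1, keepdims=True)
    a = action_emb / np.linalg.norm(action_emb, axis=1, keepdims=True)
    logits = p @ a.T / temperature                  # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))              # match i-th pair to i-th action

def reward_term(pred_reward, reward):
    """Squared error of a reward-prediction head on the representations,
    pulling task-related (even uncontrollable) information into the encoder."""
    return np.mean((pred_reward - reward) ** 2)

def temporal_coherence(z_t, z_next):
    """Penalize large jumps between consecutive representations to
    suppress fast-changing, task-irrelevant distractors."""
    return np.mean(np.sum((z_next - z_t) ** 2, axis=1))

def total_loss(pair_emb, action_emb, pred_reward, reward, z_t, z_next,
               weights=(1.0, 1.0, 0.1)):
    """Weighted sum of the three auxiliary objectives (weights are illustrative)."""
    w_a, w_r, w_c = weights
    return (w_a * action_info_nce(pair_emb, action_emb)
            + w_r * reward_term(pred_reward, reward)
            + w_c * temporal_coherence(z_t, z_next))
```

In practice these terms would be computed from the outputs of a learned encoder and projection heads, and minimized jointly with the reinforcement-learning loss rather than in isolation.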
Pages: 9