Meta-Reinforcement Learning Algorithm Based on Reward and Dynamic Inference

Cited: 0
Authors
Chen, Jinhao [1 ]
Zhang, Chunhong [2 ]
Hu, Zheng [1 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100088, Peoples R China
[2] Beijing Univ Posts & Telecommun, Key Lab Universal Wireless Commun, Minist Educ, Beijing 100088, Peoples R China
Keywords
Meta-Reinforcement Learning; Variational Inference; Hidden Feature;
DOI
10.1007/978-981-97-2259-4_17
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Meta-Reinforcement Learning aims to rapidly solve unseen tasks that share similar structures. However, the agent relies heavily on a large amount of experience during the meta-training phase, which makes high sample efficiency a formidable challenge. Current methods typically adapt to novel tasks within the Meta-Reinforcement Learning framework through task inference. Unfortunately, these approaches still exhibit limitations when faced with a high-complexity task space. In this paper, we propose a Meta-Reinforcement Learning method based on reward and dynamic inference. We introduce independent reward and dynamic inference encoders, which sample task-specific context information to capture deep-level features of task goals and dynamics. By reducing the task inference space, the agent effectively learns the structure shared across tasks and acquires a deeper understanding of the differences between them. We illustrate the performance degradation caused by high task-inference complexity and demonstrate that our method outperforms previous algorithms in sample efficiency.
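The core idea the abstract describes, splitting task inference into two independent variational encoders, one over reward-relevant context (s, a, r) and one over dynamics-relevant context (s, a, s'), can be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' implementation: the `GaussianContextEncoder` class, the PEARL-style product-of-Gaussians posterior, and the use of a single random linear layer in place of a trained network are all assumptions made for the sake of a runnable example.

```python
import numpy as np

rng = np.random.default_rng(0)

class GaussianContextEncoder:
    """Amortized variational encoder: maps a batch of context tuples to a
    diagonal-Gaussian posterior over a task latent, then samples from it.
    (A random linear layer stands in for a trained network here.)"""
    def __init__(self, in_dim, latent_dim):
        self.W = rng.normal(scale=0.1, size=(in_dim, 2 * latent_dim))
        self.latent_dim = latent_dim

    def posterior(self, context):               # context: (N, in_dim)
        h = context @ self.W                    # per-transition Gaussian factors
        mu, log_var = h[:, :self.latent_dim], h[:, self.latent_dim:]
        prec = np.exp(-log_var)
        # Product of the N factors with a N(0, I) prior (PEARL-style):
        var = 1.0 / (1.0 + prec.sum(axis=0))
        mean = var * (prec * mu).sum(axis=0)
        return mean, var

    def sample(self, context):
        mean, var = self.posterior(context)
        return mean + np.sqrt(var) * rng.normal(size=self.latent_dim)

s_dim, a_dim, latent = 4, 2, 3
# Independent encoders over *different* slices of each transition,
# which is what shrinks the task inference space:
reward_enc   = GaussianContextEncoder(s_dim + a_dim + 1, latent)   # (s, a, r)
dynamics_enc = GaussianContextEncoder(2 * s_dim + a_dim, latent)   # (s, a, s')

# A toy context batch of 8 transitions (s, a, r, s'):
s, a = rng.normal(size=(8, s_dim)), rng.normal(size=(8, a_dim))
r, s_next = rng.normal(size=(8, 1)), rng.normal(size=(8, s_dim))

z_reward   = reward_enc.sample(np.concatenate([s, a, r], axis=1))
z_dynamics = dynamics_enc.sample(np.concatenate([s, a, s_next], axis=1))

# The policy would then condition on both latents: pi(a | s, z_reward, z_dynamics)
task_embedding = np.concatenate([z_reward, z_dynamics])
print(task_embedding.shape)  # (6,)
```

Conditioning the policy on the concatenated latents lets reward goals and transition dynamics vary independently across tasks, rather than forcing a single latent to explain both.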
Pages: 223-234
Page count: 12
Related Papers
50 records in total
  • [1] A Federated Meta-Reinforcement Learning Algorithm Based on Gradient Correction
    Qin, Zerui
    Yue, Sheng
    PROCEEDINGS OF THE ACM TURING AWARD CELEBRATION CONFERENCE-CHINA 2024, ACM-TURC 2024, 2024, : 220 - 221
  • [2] A Meta-Reinforcement Learning Algorithm for Causal Discovery
    Sauter, Andreas
    Acar, Erman
    Francois-Lavet, Vincent
    CONFERENCE ON CAUSAL LEARNING AND REASONING, VOL 213, 2023, 213 : 602 - 619
  • [3] Meta-Reinforcement Learning With Dynamic Adaptiveness Distillation
    Hu, Hangkai
    Huang, Gao
    Li, Xiang
    Song, Shiji
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (03) : 1454 - 1464
  • [4] Dynamic Channel Access via Meta-Reinforcement Learning
    Lu, Ziyang
    Gursoy, M. Cenk
    2021 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2021,
  • [5] Off-Policy Meta-Reinforcement Learning With Belief-Based Task Inference
    Imagawa, Takahisa
    Hiraoka, Takuya
    Tsuruoka, Yoshimasa
    IEEE ACCESS, 2022, 10 : 49494 - 49507
  • [6] Hypernetworks in Meta-Reinforcement Learning
    Beck, Jacob
    Jackson, Matthew
    Vuorio, Risto
    Whiteson, Shimon
    CONFERENCE ON ROBOT LEARNING, VOL 205, 2022, 205 : 1478 - 1487
  • [7] Meta-Reinforcement Learning in Non-Stationary and Dynamic Environments
    Bing, Zhenshan
    Lerch, David
    Huang, Kai
    Knoll, Alois
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (03) : 3476 - 3491
  • [8] Model-based Adversarial Meta-Reinforcement Learning
    Lin, Zichuan
    Thomas, Garrett
    Yang, Guangwen
    Ma, Tengyu
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [9] Model-Based Meta-reinforcement Learning for Hyperparameter Optimization
    Albrechts, Jeroen
    Martin, Hugo M.
    Tavakol, Maryam
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2024, PT I, 2025, 15346 : 27 - 39
  • [10] Meta-Reinforcement Learning Based Resource Allocation for Dynamic V2X Communications
    Yuan, Yi
    Zheng, Gan
    Wong, Kai-Kit
    Letaief, Khaled B.
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2021, 70 (09) : 8964 - 8977