Global Policy Construction in Modular Reinforcement Learning

Authors
Zhang, Ruohan [1]
Song, Zhao [1]
Ballard, Dana H. [1]
Affiliation
[1] Univ Texas Austin, Dept Comp Sci, 2317 Speedway, Stop D9500, Austin, TX 78712 USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
We propose a modular reinforcement learning algorithm that decomposes a Markov decision process into independent modules. Each module is trained using Sarsa(λ). We introduce three algorithms for forming a global policy from the module policies, and demonstrate our results in a 2D grid world.
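The idea in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: this record does not describe the three global-policy algorithms, so the sketch combines modules by summing their Q-values and acting greedily, a rule chosen here only for illustration. The grid size, rewards, hyperparameters, and all function names (`train_module`, `global_action`, and so on) are hypothetical.

```python
import random
from collections import defaultdict

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up


def step(pos, move, size):
    """Move on a size x size grid, staying in place when a wall blocks the move."""
    row = min(max(pos[0] + move[0], 0), size - 1)
    col = min(max(pos[1] + move[1], 0), size - 1)
    return (row, col)


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = [q[(state, a)] for a in range(len(ACTIONS))]
    return values.index(max(values))


def train_module(target, reward, size=5, episodes=200, alpha=0.1,
                 gamma=0.9, lam=0.8, epsilon=0.1, seed=0):
    """Tabular Sarsa(lambda) with accumulating traces for a single module.

    Assumption: a module sees one object at `target`, receives `reward`
    on reaching it, and the episode ends there.
    """
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        traces = defaultdict(float)
        state = (rng.randrange(size), rng.randrange(size))
        if state == target:
            continue  # nothing to learn from an episode that starts at the target
        action = epsilon_greedy(q, state, epsilon, rng)
        for _ in range(2 * size * size):  # step cap per episode
            next_state = step(state, ACTIONS[action], size)
            done = next_state == target
            r = reward if done else 0.0
            next_action = epsilon_greedy(q, next_state, epsilon, rng)
            bootstrap = 0.0 if done else gamma * q[(next_state, next_action)]
            delta = r + bootstrap - q[(state, action)]
            traces[(state, action)] += 1.0
            for key in list(traces):
                q[key] += alpha * delta * traces[key]
                traces[key] *= gamma * lam
            if done:
                break
            state, action = next_state, next_action
    return q


def global_action(modules, state):
    """Assumed combination rule: sum module Q-values, then act greedily."""
    totals = [sum(q[(state, a)] for q in modules) for a in range(len(ACTIONS))]
    return totals.index(max(totals))


# Two independently trained modules: one attracted to a prize, one repelled by an obstacle.
prize = train_module(target=(4, 4), reward=1.0)
obstacle = train_module(target=(2, 2), reward=-1.0, seed=1)
choice = global_action([prize, obstacle], state=(2, 1))
print("chosen move:", ACTIONS[choice])
```

Each module learns its own Q-table without knowledge of the others, and the global policy is formed only at action-selection time. Other combination rules (for example, voting or selecting the module with the largest value) would replace only `global_action`.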
Pages: 4226-4227
Number of pages: 2