Global Policy Construction in Modular Reinforcement Learning

Authors
Zhang, Ruohan [1]
Song, Zhao [1]
Ballard, Dana H. [1]
Affiliation
[1] Univ Texas Austin, Dept Comp Sci, 2317 Speedway,Stop D9500, Austin, TX 78712 USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a modular reinforcement learning algorithm that decomposes a Markov decision process into independent modules. Each module is trained using Sarsa(λ). We introduce three algorithms for forming a global policy from the module policies, and demonstrate our results in a 2D grid world.
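The abstract does not name the three combination algorithms, so the following is only a minimal sketch of one common way a global policy might be formed from module policies: summing each module's Q-values per action and acting greedily on the total. The names `global_action`, `goal_q`, and `avoid_q`, the four-action grid, and the Q-values are all hypothetical and are not taken from the paper.

```python
# Hypothetical sketch: NOT the paper's algorithms, which the abstract does not specify.
# Assumes each module exposes a tabular Q-function over the same state and action sets.
from typing import Dict, List, Tuple

State = Tuple[int, int]                      # (row, col) in a 2D grid world
ACTIONS = ["up", "down", "left", "right"]    # assumed action set
QTable = Dict[Tuple[State, str], float]      # maps (state, action) to a Q-value


def global_action(module_qs: List[QTable], state: State) -> str:
    """Return the action with the largest Q-value summed over all modules."""
    totals = {
        a: sum(q.get((state, a), 0.0) for q in module_qs)  # unseen pairs count as 0
        for a in ACTIONS
    }
    return max(ACTIONS, key=lambda a: totals[a])


# Two toy modules with made-up values: one seeks a goal, one avoids an obstacle.
s = (0, 0)
goal_q = {(s, "right"): 1.0, (s, "up"): 0.2}
avoid_q = {(s, "right"): -2.0, (s, "up"): 0.1}
chosen = global_action([goal_q, avoid_q], s)
```

With these made-up values the goal module alone would prefer `right`, but the obstacle module's penalty tips the summed preference to `up`, which illustrates how a global choice can differ from any single module's greedy choice.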
Pages: 4226-4227
Page count: 2