Finding hidden hierarchy in reinforcement learning

Authors
Poulton, G
Guo, Y
Lu, W
Affiliations
[1] CSIRO, Autonomous Syst Informat & Commun Technol Ctr, Epping, NSW 1710, Australia
[2] Univ New S Wales, Kensington, NSW 2033, Australia
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
HEXQ is a reinforcement learning algorithm that decomposes a problem into subtasks and constructs a hierarchy from the state variables. The maximum number of levels in the hierarchy is therefore limited by the number of variables that represent a state. In HEXQ, values learned for a subtask can be reused in different contexts only if the subtasks are identical; values for non-identical subtasks must be trained separately. This paper introduces a method that addresses both restrictions. Experimental results show that the method can dramatically reduce training time.
Pages: 554-561
Page count: 8