Finding hidden hierarchy in reinforcement learning

Citations: 0

Authors:

Poulton, G

Guo, Y

Lu, W

Affiliations:

[1] CSIRO, Autonomous Syst Informat & Commun Technol Ctr, Epping, NSW 1710, Australia

[2] Univ New S Wales, Kensington, NSW 2033, Australia

Source:

KNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 3, PROCEEDINGS | 2005 / Vol. 3683

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

HEXQ is a reinforcement learning algorithm that decomposes a problem into subtasks and constructs a hierarchy using state variables. The maximum number of levels in the hierarchy is constrained by the number of variables representing a state. In HEXQ, values learned for a subtask can be reused in different contexts only if the subtasks are identical; values for non-identical subtasks must be trained separately. This paper introduces a method that addresses these two restrictions. Experimental results show that the method can reduce training time dramatically.

Pages: 554-561

Page count: 8
