Swarm Reinforcement Learning Method Based on Hierarchical Q-Learning

Cited: 0
Authors:
Kuroe, Yasuaki [1 ]
Takeuchi, Kenya [1 ]
Maeda, Yutaka [1 ]
Affiliations:
[1] Kansai Univ, Fac Engn Sci, Suita, Osaka, Japan
Funding: Japan Society for the Promotion of Science (JSPS)
Keywords:
reinforcement learning method; partially observed Markov decision process; hierarchical Q-learning; swarm intelligence;
DOI: 10.1109/SSCI50451.2021.9659877
CLC number: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Over the last few decades, reinforcement learning has attracted a great deal of attention and has been studied extensively. However, it is basically a trial-and-error scheme, so acquiring optimal strategies can require considerable computational time, and for large, complicated problems with many states optimal strategies may not be obtained at all. To resolve these problems we have proposed the swarm reinforcement learning method, which is inspired by multi-point search optimization methods. Swarm reinforcement learning has been studied extensively and its effectiveness has been confirmed on several problems, especially Markov decision processes in which the agents can fully observe the state of the environment. In many real-world problems, however, the agents cannot fully observe the environment; such problems are usually partially observable Markov decision processes (POMDPs). The purpose of this paper is to develop a swarm reinforcement learning method that can deal with POMDPs. We propose a swarm reinforcement learning method based on HQ-learning, a hierarchical extension of Q-learning in which a sequence of subagents, each pursuing its own subgoal, decomposes a POMDP into a series of simpler reactive subtasks. Experiments show that the proposed method can handle POMDPs and achieves higher performance than the original HQ-learning.
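To make the mechanics concrete, the sketch below shows one way a swarm of tabular Q-learners can periodically exchange value information, following the multi-point-search intuition described in the abstract. It is a minimal illustration, not the authors' algorithm: the QAgent class, the share_best blending rule with mixing rate beta, and all hyperparameters are assumptions introduced here, and the paper's method additionally makes each swarm member an HQ-learner (a chain of Q-learning subagents with learned subgoals) so that POMDPs can be handled.

import random
from collections import defaultdict

class QAgent:
    """One swarm member: an independent tabular Q-learner."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.95, eps=0.1):
        self.q = defaultdict(float)   # Q[(state, action)] -> estimated value
        self.n_actions = n_actions
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def act(self, s):
        # Epsilon-greedy action selection.
        if random.random() < self.eps:
            return random.randrange(self.n_actions)
        return max(range(self.n_actions), key=lambda a: self.q[(s, a)])

    def update(self, s, a, r, s2):
        # Standard one-step Q-learning backup.
        best_next = max(self.q[(s2, a2)] for a2 in range(self.n_actions))
        self.q[(s, a)] += self.alpha * (r + self.gamma * best_next - self.q[(s, a)])

def share_best(agents, returns, beta=0.5):
    # Information exchange step: pull every member's Q-table toward the
    # best-performing member's table (one of several exchange rules that
    # have been studied in swarm reinforcement learning).
    best = agents[max(range(len(agents)), key=lambda i: returns[i])]
    for ag in agents:
        if ag is best:
            continue
        for key in set(ag.q) | set(best.q):
            ag.q[key] += beta * (best.q[key] - ag.q[key])

# Usage sketch against a Gym-style environment `env` (hypothetical): run one
# episode per member, record each member's return, then call
# share_best(agents, returns) before the next generation of episodes.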
Pages: 8
Related papers (50 records in total):
  • [21] Parallel Implementation of Reinforcement Learning Q-Learning Technique for FPGA
    Da Silva, Lucileide M. D.
    Torquato, Matheus F.
    Fernandes, Marcelo A. C.
    IEEE ACCESS, 2019, 7 : 2782 - 2798
  • [22] Concurrent Q-learning: Reinforcement learning for dynamic goals and environments
    Ollington, RB
    Vamplew, PW
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2005, 20 (10) : 1037 - 1052
  • [23] Constraints Penalized Q-learning for Safe Offline Reinforcement Learning
    Xu, Haoran
    Zhan, Xianyuan
    Zhu, Xiangyu
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022: 8753 - 8760
  • [24] Deep Reinforcement Learning with Sarsa and Q-Learning: A Hybrid Approach
    Xu, Zhi-xiong
    Cao, Lei
    Chen, Xi-liang
    Li, Chen-xi
    Zhang, Yong-liang
    Lai, Jun
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2018, E101D (09) : 2315 - 2322
  • [25] Nested Q-learning of hierarchical control structures
    Digney, BL
    ICNN - 1996 IEEE INTERNATIONAL CONFERENCE ON NEURAL NETWORKS, VOLS. 1-4, 1996, : 161 - 166
  • [26] An Enhanced Ensemble Learning Method for Sentiment Analysis based on Q-learning
    Savargiv, Mohammad
    Masoumi, Behrooz
    Keyvanpour, Mohammad Reza
    IRANIAN JOURNAL OF SCIENCE AND TECHNOLOGY-TRANSACTIONS OF ELECTRICAL ENGINEERING, 2024, 48 (03) : 1261 - 1277
  • [27] Enhanced Machine Learning Algorithms: Deep Learning, Reinforcement Learning, and Q-Learning
    Park, Ji Su
    Park, Jong Hyuk
    JOURNAL OF INFORMATION PROCESSING SYSTEMS, 2020, 16 (05): 1001 - 1007
  • [28] Nested Q-learning of hierarchical control structures
    Digney, BL
    ICNN - 1996 IEEE INTERNATIONAL CONFERENCE ON NEURAL NETWORKS, VOLS. 1-4, 1996, : 1676 - 1681
  • [29] Linear quadratic optimal control method based on output feedback inverse reinforcement Q-learning
    Liu, Wen
    Fan, Jia-Lu
    Xue, Wen-Qian
    Kongzhi Lilun Yu Yingyong/Control Theory and Applications, 2024, 41 (08): 1469 - 1479
  • [30] Decision-making method for vehicle longitudinal automatic driving based on reinforcement Q-learning
    Gao, Zhenhai
    Sun, Tianjun
    Xiao, Hongwei
    INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS, 2019, 16 (03)