Swarm Reinforcement Learning Method Based on Hierarchical Q-Learning

Cited by: 0
Authors
Kuroe, Yasuaki [1 ]
Takeuchi, Kenya [1 ]
Maeda, Yutaka [1 ]
Affiliations
[1] Kansai Univ, Fac Engn Sci, Suita, Osaka, Japan
Source
2021 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2021) | 2021
Funding
Japan Society for the Promotion of Science;
Keywords
reinforcement learning method; partially observed Markov decision process; hierarchical Q-learning; swarm intelligence;
DOI
10.1109/SSCI50451.2021.9659877
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent decades, reinforcement learning has attracted a great deal of attention and has been studied extensively. However, it is essentially a trial-and-error scheme, and acquiring optimal strategies takes considerable computation time. Furthermore, optimal strategies may not be obtained for large, complicated problems with many states. To resolve these problems, we have proposed the swarm reinforcement learning method, which is inspired by multi-point search optimization methods. The swarm reinforcement learning method has been studied extensively and its effectiveness has been confirmed for several problems, especially Markov decision processes in which the agents can fully observe the states of the environment. In many real-world problems, however, the agents cannot fully observe the environment, and such problems are usually partially observable Markov decision processes (POMDPs). The purpose of this paper is to develop a swarm reinforcement learning method that can deal with POMDPs. We propose a swarm reinforcement learning method based on HQ-learning, which is a hierarchical extension of Q-learning. Experiments show that the proposed method can handle POMDPs and achieves higher performance than the original HQ-learning.
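The multi-agent idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it uses plain tabular Q-learning on a fully observable toy chain rather than HQ-learning on a POMDP, and the exchange rule (every agent adopts the table of the agent that reached the goal fastest) is an assumption chosen for brevity. The environment and the names `step`, `run_episode`, and `train_swarm` are hypothetical.

```python
import copy
import random

N_STATES = 5          # hypothetical chain: states 0..4, state 4 is the goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right

def step(state, action):
    """Deterministic toy environment (an assumption, not from the paper)."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def run_episode(q, rng, alpha=0.5, gamma=0.9, epsilon=0.2, max_steps=50):
    """Run one epsilon-greedy Q-learning episode in place; return steps used."""
    state = 0
    for t in range(1, max_steps + 1):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            # Greedy action with random tie-breaking.
            best = max(q[state])
            action = rng.choice([a for a in ACTIONS if q[state][a] == best])
        nxt, reward, done = step(state, action)
        # Standard tabular Q-learning update.
        q[state][action] += alpha * (reward + gamma * max(q[nxt]) - q[state][action])
        state = nxt
        if done:
            return t
    return max_steps

def train_swarm(n_agents=4, n_episodes=200, seed=0):
    """Several agents learn in parallel and exchange information each episode."""
    rng = random.Random(seed)
    tables = [[[0.0, 0.0] for _ in range(N_STATES)] for _ in range(n_agents)]
    for _ in range(n_episodes):
        steps = [run_episode(q, rng) for q in tables]
        best = tables[steps.index(min(steps))]
        # Illustrative exchange rule (assumed): all agents adopt the best table.
        tables = [copy.deepcopy(best) for _ in range(n_agents)]
    return tables[0]

if __name__ == "__main__":
    for state, values in enumerate(train_swarm()):
        print(state, [round(v, 3) for v in values])
```

Each agent explores independently within an episode, so the shared table at the start of the next episode reflects whichever agent found the shortest path. Other exchange rules, such as averaging tables, would fit the same loop.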
Pages: 8
Related papers
50 in total
  • [41] Improving the efficiency of reinforcement learning for a spacecraft powered descent with Q-learning
    Callum Wilson
    Annalisa Riccardi
    Optimization and Engineering, 2023, 24 : 223 - 255
  • [42] Adaptive Reinforcement Q-Learning Algorithm for Swarm-Robot System using Pheromone Mechanism
    Shi, Zhiguo
    Tu, Jun
    Li, Yuankai
    Wang, Zeying
    2013 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND BIOMIMETICS (ROBIO), 2013, : 952 - 957
  • [43] DDNSAS: Deep reinforcement learning based deep Q-learning network for smart agriculture system
    Devarajan, Ganesh Gopal
    Nagarajan, Senthil Murugan
    Ramana, T. V.
    Vignesh, T.
    Ghosh, Uttam
    Alnumay, Waleed
    SUSTAINABLE COMPUTING-INFORMATICS & SYSTEMS, 2023, 39
  • [44] Dueling double Q-learning based reinforcement learning approach for the flow shop scheduling problem
    Kim S.J.
    Kim B.W.
    Transactions of the Korean Institute of Electrical Engineers, 2021, 70 (10): : 1497 - 1508
  • [45] Reinforcement Learning based optimisation of heating curve Flow temperature adjustment by means of Q-learning
    Huang, Chenzi
    Seidel, Stephan
    Pruvost, Herve
    Brauenig, Jan
    ATP MAGAZINE, 2023, (04): : 70 - 77
  • [46] Inverted pendulum control of double q-learning reinforcement learning algorithm based on neural network
    Zhang, Daode
    Wang, Xiaolong
    Li, Xuesheng
    Wang, Dong
    UPB Scientific Bulletin, Series D: Mechanical Engineering, 2020, 82 (02): : 15 - 26
  • [47] Tabular Q-learning Based Reinforcement Learning Agent for Autonomous Vehicle Drift Initiation and Stabilization
    Toth, Szilard H.
    Bardos, Adam
    Viharos, Zsolt J.
    IFAC PAPERSONLINE, 2023, 56 (02): : 4896 - 4903
  • [48] A novel multi-step Q-learning method to improve data efficiency for deep reinforcement learning
    Yuan, Yinlong
    Yu, Zhu Liang
    Gu, Zhenghui
    Yeboah, Yao
    Wei, Wu
    Deng, Xiaoyan
    Li, Jingcong
    Li, Yuanqing
    KNOWLEDGE-BASED SYSTEMS, 2019, 175 : 107 - 117
  • [49] Reinforcement distribution in a team of cooperative Q-learning agents
    Abbasi, Zahra
    Abbasi, Mohammad Ali
    PROCEEDINGS OF NINTH ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING AND PARALLEL/DISTRIBUTED COMPUTING, 2008, : 154 - +
  • [50] The Sample Complexity of Teaching-by-Reinforcement on Q-Learning
    Zhang, Xuezhou
    Bharti, Shubham Kumar
    Ma, Yuzhe
    Singla, Adish
    Zhu, Xiaojin
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 10939 - 10947