A Scalable Parallel Q-Learning Algorithm for Resource Constrained Decentralized Computing Environments

Cited by: 7
Authors:
Camelo, Miguel [1 ]
Famaey, Jeroen [1 ]
Latre, Steven [1 ]
Affiliation:
[1] Univ Antwerp, IMEC, Dept Math & Comp Sci, Middelheimlaan 1, B-2020 Antwerp, Belgium
Keywords:
DOI:
10.1109/MLHPC.2016.007
CLC Classification:
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes:
081104 ; 0812 ; 0835 ; 1405 ;
Abstract:
The Internet of Things (IoT) is increasingly becoming a platform for mission-critical applications with stringent requirements in terms of response time and mobility. A centralized High Performance Computing (HPC) environment is therefore often unsuitable or simply non-existent. Instead, there is a need for a scalable HPC model that supports the deployment of applications on the decentralized but resource-constrained devices of the IoT. Recently, Reinforcement Learning (RL) algorithms have been used for decision making within applications by directly interacting with the environment. However, most RL algorithms are designed for centralized environments and are time- and resource-consuming, which makes them inapplicable to such constrained decentralized computing environments. In this paper, we propose a scalable Parallel Q-Learning (PQL) algorithm for resource-constrained environments. By combining a table-partitioning strategy with co-allocation of both processing and storage, we significantly reduce the resource cost per device while guaranteeing convergence and minimizing the communication cost. Experimental results show that our algorithm reduces the required training time in proportion to the number of Q-Learning agents and, in terms of execution time, is up to 24 times faster than several well-known PQL algorithms.
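The core idea the abstract describes, storing each partition of the Q-table on the agent that also performs the updates for it, can be illustrated with a minimal sketch. The Python below is an assumption-based illustration, not the authors' implementation: the hash partitioner (owner), the shard container (ShardAgent), the lookup helper (q_row), and the toy chain environment are all hypothetical names and choices, and the inter-device communication layer whose cost the paper minimizes is reduced here to a local function call.

```python
# Minimal sketch (assumed details, not the paper's code): tabular Q-learning
# over a Q-table that is hash-partitioned across agents, so each agent stores
# only the rows for the states it owns.
import random

N_STATES = 12          # toy chain environment: states 0..11, goal at 11
N_ACTIONS = 2          # 0 = move left, 1 = move right
N_AGENTS = 3
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

def owner(state):
    """Hash-based partition: which agent stores the Q-values of `state`."""
    return state % N_AGENTS

class ShardAgent:
    """Holds only the Q-table rows for the states this agent owns."""
    def __init__(self, agent_id):
        self.q = {s: [0.0] * N_ACTIONS
                  for s in range(N_STATES) if owner(s) == agent_id}

agents = [ShardAgent(i) for i in range(N_AGENTS)]

def q_row(state):
    # In a real deployment this is a remote lookup on the shard owner;
    # its frequency is the communication cost co-allocation tries to cut.
    return agents[owner(state)].q[state]

def greedy(state):
    row = q_row(state)
    best = max(row)
    return random.choice([a for a, v in enumerate(row) if v == best])

def step(state, action):
    """Deterministic chain: reward 1 only when the goal state is reached."""
    nxt = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0), nxt == N_STATES - 1

for episode in range(300):
    s = 0
    for _ in range(200):                      # cap episode length
        a = random.randrange(N_ACTIONS) if random.random() < EPS else greedy(s)
        s2, r, done = step(s, a)
        # Update executed by the owner of s; it needs only max_a Q(s2, a)
        # from the owner of s2, so each transition touches at most two shards.
        target = r + (0.0 if done else GAMMA * max(q_row(s2)))
        q_row(s)[a] += ALPHA * (target - q_row(s)[a])
        s = s2
        if done:
            break

print([round(max(q_row(s)), 2) for s in range(N_STATES)])
```

Because the shards are disjoint, per-device storage shrinks roughly linearly with the number of agents, which is the resource-cost reduction the abstract claims; the convergence and speedup guarantees rest on the paper's own analysis, not on this sketch.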
Pages: 27-35
Number of pages: 9
Related Papers
50 records in total
  • [31] Labeling Q-learning in hidden state environments
    Hae-Yeon Lee
    Hiroyuki Kamaya
    Ken-ichi Abe
    Artificial Life and Robotics, 2002, 6 (4) : 181 - 184
  • [32] ENHANCEMENTS OF FUZZY Q-LEARNING ALGORITHM
    Glowaty, Grzegorz
    COMPUTER SCIENCE-AGH, 2005, 7 : 77 - 87
  • [33] An analysis of the pheromone Q-learning algorithm
    Monekosso, N
    Remagnino, P
    ADVANCES IN ARTIFICIAL INTELLIGENCE - IBERAMIA 2002, PROCEEDINGS, 2002, 2527 : 224 - 232
  • [34] A Weighted Smooth Q-Learning Algorithm
    Vijesh, V. Antony
    Shreyas, S. R.
    IEEE CONTROL SYSTEMS LETTERS, 2025, 9 : 21 - 26
  • [35] An improved immune Q-learning algorithm
    Ji, Zhengqiao
    Wu, Q. M. Jonathan
    Sid-Ahmed, Maher
    2007 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS, VOLS 1-8, 2007, : 3330 - +
  • [36] Multi-Agent Cooperation Q-Learning Algorithm Based on Constrained Markov Game
    Ge, Yangyang
    Zhu, Fei
    Huang, Wei
    Zhao, Peiyao
    Liu, Quan
    COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2020, 17 (02) : 647 - 664
  • [37] GUI Testing to the Power of Parallel Q-Learning
    Mobilio, Marco
    Clerissi, Diego
    Denaro, Giovanni
    Mariani, Leonardo
    2023 IEEE/ACM 45TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: COMPANION PROCEEDINGS, ICSE-COMPANION, 2023, : 55 - 59
  • [38] Q-learning based hyper-heuristic algorithm for solving multi-mode resource-constrained project scheduling problem
    Cui J.
    Lyu Y.
    Xu Z.
    Jisuanji Jicheng Zhizao Xitong/Computer Integrated Manufacturing Systems, CIMS, 2022, 28 (05): : 1472 - 1481
  • [39] Concurrent Q-learning: Reinforcement learning for dynamic goals and environments
    Ollington, RB
    Vamplew, PW
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2005, 20 (10) : 1037 - 1052
  • [40] Autonomous Decentralized Traffic Control Using Q-Learning in LPWAN
    Kaburaki, Aoto
    Adachi, Koichi
    Takyu, Osamu
    Ohta, Mai
    Fujii, Takeo
    IEEE ACCESS, 2021, 9 : 93651 - 93661