A Scalable Parallel Q-Learning Algorithm for Resource Constrained Decentralized Computing Environments

Cited by: 7
Authors
Camelo, Miguel [1 ]
Famaey, Jeroen [1 ]
Latre, Steven [1 ]
Affiliations
[1] Univ Antwerp, IMEC, Dept Math & Comp Sci, Middelheimlaan 1, B-2020 Antwerp, Belgium
DOI
10.1109/MLHPC.2016.007
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Internet of Things (IoT) is increasingly becoming a platform for mission-critical applications with stringent requirements in terms of response time and mobility. Therefore, a centralized High Performance Computing (HPC) environment is often unsuitable or simply nonexistent. Instead, there is a need for a scalable HPC model that supports the deployment of applications on the decentralized but resource-constrained devices of the IoT. Recently, Reinforcement Learning (RL) algorithms have been used for decision making within applications by directly interacting with the environment. However, most RL algorithms are designed for centralized environments and are time- and resource-consuming. Therefore, they are not applicable to such constrained decentralized computing environments. In this paper, we propose a scalable Parallel Q-Learning (PQL) algorithm for resource-constrained environments. By combining a table partition strategy with a co-allocation of both processing and storage, we can significantly reduce the individual resource cost and, at the same time, guarantee convergence and minimize the communication cost. Experimental results show that our algorithm reduces the required training in proportion to the number of Q-Learning agents and that, in terms of execution time, it is up to 24 times faster than several well-known PQL algorithms.
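The table-partition idea in the abstract can be illustrated with a small sketch. This is not the authors' PQL algorithm: it is a minimal toy, assuming a six-state chain environment, a modulo rule for assigning states to partitions, and a uniformly random behaviour policy (Q-learning is off-policy, so it still learns the greedy values). The names `Partition`, `owner`, and `train` are hypothetical. The partitions are stepped sequentially in one process rather than run in parallel; the point is only that each agent stores and updates its own slice of the Q-table and that a read from another slice is the only cross-agent communication.

```python
import random


class Partition:
    """One agent's slice of the Q-table: stores and updates only the states it owns."""

    def __init__(self, states, n_actions):
        self.q = {s: [0.0] * n_actions for s in states}

    def max_q(self, state):
        return max(self.q[state])

    def update(self, state, action, target, alpha):
        # Standard Q-learning step applied to a locally stored entry.
        self.q[state][action] += alpha * (target - self.q[state][action])


def owner(state, n_parts):
    # Assumed partition rule (hypothetical): states are spread round-robin.
    return state % n_parts


def train(n_states=6, n_actions=2, n_parts=3, episodes=200,
          alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    parts = [
        Partition([s for s in range(n_states) if owner(s, n_parts) == p], n_actions)
        for p in range(n_parts)
    ]
    goal = n_states - 1
    remote_reads = 0  # counts reads that cross a partition boundary
    for _ in range(episodes):
        s = 0
        while s != goal:
            a = rng.randrange(n_actions)  # random exploration
            # Toy chain dynamics: action 1 moves right, action 0 moves left.
            s2 = min(s + 1, goal) if a == 1 else max(s - 1, 0)
            r = 1.0 if s2 == goal else 0.0
            local = parts[owner(s, n_parts)]
            nxt = parts[owner(s2, n_parts)]
            if nxt is not local:
                remote_reads += 1
            bootstrap = 0.0 if s2 == goal else nxt.max_q(s2)
            local.update(s, a, r + gamma * bootstrap, alpha)
            s = s2
    return parts, remote_reads


if __name__ == "__main__":
    parts, remote_reads = train()
    for p, part in enumerate(parts):
        print(p, part.q)
    print("remote reads:", remote_reads)
```

In this toy, each partition holds two of the six states, so per-agent storage shrinks with the number of partitions, while `remote_reads` gives a rough picture of the communication that a partitioning scheme has to keep small.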
Pages: 27-35
Number of pages: 9
Related papers
50 in total
  • [1] Stateless Q-learning algorithm for service caching in resource constrained edge environment
    Binbin Huang
    Ziqi Ran
    Dongjin Yu
    Yuanyuan Xiang
    Xiaoying Shi
    Zhongjin Li
    Zhengqian Xu
    Journal of Cloud Computing, 12
  • [2] Stateless Q-learning algorithm for service caching in resource constrained edge environment
    Huang, Binbin
    Ran, Ziqi
    Yu, Dongjin
    Xiang, Yuanyuan
    Shi, Xiaoying
    Li, Zhongjin
    Xu, Zhengqian
    JOURNAL OF CLOUD COMPUTING-ADVANCES SYSTEMS AND APPLICATIONS, 2023, 12 (01):
  • [3] Implications of Decentralized Q-learning Resource Allocation in Wireless Networks
    Wilhelmi, Francesc
    Bellalta, Boris
    Cano, Cristina
    Jonsson, Anders
    2017 IEEE 28TH ANNUAL INTERNATIONAL SYMPOSIUM ON PERSONAL, INDOOR, AND MOBILE RADIO COMMUNICATIONS (PIMRC), 2017,
  • [4] I2Q: A Fully Decentralized Q-Learning Algorithm
    Jiang, Jiechuan
    Lu, Zongqing
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [5] Fuzzy Q-learning approach for autonomic resource provisioning of IoT applications in fog computing environments
    Faraji-Mehmandar M.
    Jabbehdari S.
    Javadi H.H.S.
    Journal of Ambient Intelligence and Humanized Computing, 2023, 14 (04) : 4237 - 4255
  • [6] Selectively Decentralized Q-Learning
    Thanh Nguyen
    Mukhopadhyay, Snehasis
    2017 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2017, : 328 - 333
  • [7] Constrained Deep Q-Learning Gradually Approaching Ordinary Q-Learning
    Ohnishi, Shota
    Uchibe, Eiji
    Yamaguchi, Yotaro
    Nakanishi, Kosuke
    Yasui, Yuji
    Ishii, Shin
    FRONTIERS IN NEUROROBOTICS, 2019, 13
  • [8] Backward Q-learning: The combination of Sarsa algorithm and Q-learning
    Wang, Yin-Hao
    Li, Tzuu-Hseng S.
    Lin, Chih-Jui
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2013, 26 (09) : 2184 - 2193
  • [9] A shuffled frog-leaping algorithm with Q-learning for unrelated parallel machine scheduling with additional resource and learning effect
    Yi, Tian
    Li, Mingbo
    Lei, Deming
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 44 (03) : 5357 - 5375
  • [10] Decentralized Q-Learning for Uplink Power Control
    Dzulkifly, Sumayyah
    Giupponi, Lorenza
    Said, Fatin
    Dohler, Mischa
    2015 IEEE 20TH INTERNATIONAL WORKSHOP ON COMPUTER AIDED MODELLING AND DESIGN OF COMMUNICATION LINKS AND NETWORKS (CAMAD), 2015, : 54 - 58