A Scalable Parallel Q-Learning Algorithm for Resource Constrained Decentralized Computing Environments

Cited by: 7
Authors:
Camelo, Miguel [1 ]
Famaey, Jeroen [1 ]
Latre, Steven [1 ]
Affiliations:
[1] Univ Antwerp, IMEC, Dept Math & Comp Sci, Middelheimlaan 1, B-2020 Antwerp, Belgium
DOI: 10.1109/MLHPC.2016.007
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract:
The Internet of Things (IoT) is increasingly becoming a platform for mission-critical applications with stringent requirements in terms of response time and mobility. A centralized High Performance Computing (HPC) environment is therefore often unsuitable or simply nonexistent. Instead, there is a need for a scalable HPC model that supports the deployment of applications on the decentralized but resource-constrained devices of the IoT. Recently, Reinforcement Learning (RL) algorithms have been used for decision making within applications by directly interacting with the environment. However, most RL algorithms are designed for centralized environments and are time- and resource-consuming, making them inapplicable to such constrained decentralized computing environments. In this paper, we propose a scalable Parallel Q-Learning (PQL) algorithm for resource-constrained environments. By combining a table partition strategy with a co-allocation of both processing and storage, we significantly reduce the per-device resource cost while, at the same time, guaranteeing convergence and minimizing the communication cost. Experimental results show that our algorithm reduces the required training time in proportion to the number of Q-Learning agents and, in terms of execution time, is up to 24 times faster than several well-known PQL algorithms.
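The abstract describes a table partition strategy in which processing and storage are co-allocated, so each agent holds and updates only its own part of the Q-table. The paper's exact partitioning and communication scheme are not reproduced here; the following is a minimal illustrative sketch of the general idea, where the class names, the modulo state-to-shard rule, and the toy chain environment are all assumptions made for the example:

```python
import random

class PartitionedQTable:
    """Tabular Q-table sharded by state: state s is owned by shard s % n_agents,
    so each agent stores only a fraction of the full table."""
    def __init__(self, n_states, n_actions, n_agents):
        self.n_agents = n_agents
        self.shards = [
            {s: [0.0] * n_actions for s in range(n_states) if s % n_agents == a}
            for a in range(n_agents)
        ]

    def owner(self, state):
        return state % self.n_agents

    def get(self, state):
        # Look up the Q-values for `state` on the shard that owns it.
        return self.shards[self.owner(state)][state]

def q_update(table, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    # Standard Q-learning update; only the shard owning s is written to.
    q = table.get(s)
    q[a] += alpha * (reward + gamma * max(table.get(s_next)) - q[a])

# Toy 1-D chain: action 1 moves right, action 0 moves left (clamped to the
# chain); reward 1.0 for reaching (or staying in) the rightmost state.
n_states, n_actions = 8, 2
table = PartitionedQTable(n_states, n_actions, n_agents=4)
random.seed(0)
for _ in range(2000):
    s = random.randrange(n_states)
    a = random.randrange(n_actions)
    s_next = min(max(s + (1 if a == 1 else -1), 0), n_states - 1)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    q_update(table, s, a, reward, s_next)
```

In a real decentralized deployment, an update whose successor state lives on another agent's shard would require a message to fetch that shard's max Q-value; minimizing exactly that cross-shard traffic is the communication cost the abstract refers to.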
Pages: 27-35 (9 pages)
Related papers (50 items)
  • [41] Regenerative Braking Algorithm for Parallel Hydraulic Hybrid Vehicles Based on Fuzzy Q-Learning
    Ning, Xiaobin
    Wang, Jiazheng
    Yin, Yuming
    Shangguan, Jiarong
    Bao, Nanxin
    Li, Ning
    ENERGIES, 2023, 16 (04)
  • [42] Decentralized Q-Learning in Zero-sum Markov Games
    Sayin, Muhammed O.
    Zhang, Kaiqing
    Leslie, David S.
    Basar, Tamer
    Ozdaglar, Asuman
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [43] Sample Complexity of Decentralized Tabular Q-Learning for Stochastic Games
    Gao, Zuguang
    Ma, Qianqian
    Basar, Tamer
    Birge, John R.
    2023 AMERICAN CONTROL CONFERENCE, ACC, 2023, : 1098 - 1103
  • [44] Decentralized Q-Learning for Weakly Acyclic Stochastic Dynamic Games
    Arslan, Gurdal
    Yuksel, Serdar
    2015 54TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2015, : 6743 - 6748
  • [45] Parallel Implementation of Reinforcement Learning Q-Learning Technique for FPGA
    Da Silva, Lucileide M. D.
    Torquato, Matheus F.
    Fernandes, Marcelo A. C.
    IEEE ACCESS, 2019, 7 : 2782 - 2798
  • [46] A Decentralized Resource Allocation in Edge Computing for Secure IoT Environments
    Sasikumar, A.
    Ravi, Logesh
    Devarajan, Malathi
    Vairavasundaram, Subramaniyaswamy
    Selvalakshmi, A.
    Kotecha, Ketan
    Abraham, Ajith
    IEEE ACCESS, 2023, 11 : 117177 - 117189
  • [47] A Deep Q-Learning Based UAV Detouring Algorithm in a Constrained Wireless Sensor Network Environment
    Rahman, Shakila
    Akter, Shathee
    Yoon, Seokhoon
    ELECTRONICS, 2025, 14 (01)
  • [48] APPLYING Q-LEARNING TO NON-MARKOVIAN ENVIRONMENTS
    Chizhov, Jurij
    Borisov, Arkady
    ICAART 2009: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, 2009, : 306 - +
  • [49] Switching Q-learning in partially observable Markovian environments
    Kamaya, H
    Lee, H
    Abe, K
    2000 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2000), VOLS 1-3, PROCEEDINGS, 2000, : 1062 - 1067
  • [50] Exponential Moving Average Q-Learning Algorithm
    Awheda, Mostafa D.
    Schwartz, Howard M.
    PROCEEDINGS OF THE 2013 IEEE SYMPOSIUM ON ADAPTIVE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING (ADPRL), 2013, : 31 - 38