A Scalable Parallel Q-Learning Algorithm for Resource Constrained Decentralized Computing Environments

Cited by: 7
Authors:
Camelo, Miguel [1]
Famaey, Jeroen [1]
Latre, Steven [1]
Affiliation:
[1] Univ Antwerp, IMEC, Dept Math & Comp Sci, Middelheimlaan 1, B-2020 Antwerp, Belgium
Keywords:
DOI:
10.1109/MLHPC.2016.007
CLC number:
TP18 [Theory of artificial intelligence];
Discipline codes:
081104 ; 0812 ; 0835 ; 1405 ;
Abstract:
The Internet of Things (IoT) is increasingly becoming a platform for mission-critical applications with stringent requirements in terms of response time and mobility. A centralized High Performance Computing (HPC) environment is therefore often unsuitable or simply non-existent. Instead, there is a need for a scalable HPC model that supports the deployment of applications on the decentralized but resource-constrained devices of the IoT. Recently, Reinforcement Learning (RL) algorithms have been used for decision making within applications by directly interacting with the environment. However, most RL algorithms are designed for centralized environments and are time and resource consuming, and are therefore not applicable to such constrained decentralized computing environments. In this paper, we propose a scalable Parallel Q-Learning (PQL) algorithm for resource-constrained environments. By combining a table partition strategy with a co-allocation of both processing and storage, we significantly reduce the resource cost per device and, at the same time, guarantee convergence and minimize the communication cost. Experimental results show that our algorithm reduces the required training in proportion to the number of Q-Learning agents and, in terms of execution time, is up to 24 times faster than several well-known PQL algorithms.
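As a rough illustration of the table-partition idea described in the abstract, the following minimal single-process Python sketch partitions a Q-table across workers by a modulo rule on the state index, so that each worker both stores and updates only its own shard. The toy chain MDP, the hyperparameters, and the helper names (owner, q_values, local_update) are assumptions made for this example, not the authors' implementation; the only cross-shard access per update is the read of max_a Q(s', a) from the next state's owner.

    # Hypothetical sketch of a partitioned Q-table (not the paper's code).
    import random

    N_STATES, N_ACTIONS, N_WORKERS = 12, 2, 4
    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

    # Each worker co-locates storage (its shard of the Q-table) with the
    # processing of updates for the states it owns.
    shards = [dict() for _ in range(N_WORKERS)]

    def owner(state):
        # Simple partition rule: the worker that stores and updates `state`.
        return state % N_WORKERS

    def q_values(state):
        # Fetch (or lazily create) the Q-values for `state` on its owner's shard.
        return shards[owner(state)].setdefault(state, [0.0] * N_ACTIONS)

    def greedy_action(state):
        # Argmax with random tie-breaking.
        q = q_values(state)
        best = max(q)
        return random.choice([a for a, v in enumerate(q) if v == best])

    def local_update(state, action, reward, next_state):
        # Standard Q-learning update, executed on the shard that owns `state`;
        # the max over the next state's Q-values would be one remote read in a
        # real decentralized deployment.
        q_next = max(q_values(next_state))
        q = q_values(state)
        q[action] += ALPHA * (reward + GAMMA * q_next - q[action])

    def step(state, action):
        # Toy chain MDP: action 1 moves right, action 0 moves left; reward of 1
        # is given on reaching the rightmost state.
        nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
        return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

    for episode in range(500):
        s = 0
        for _ in range(200):  # cap episode length
            a = random.randrange(N_ACTIONS) if random.random() < EPSILON else greedy_action(s)
            s2, r = step(s, a)
            local_update(s, a, r, s2)
            s = s2
            if s == N_STATES - 1:
                break

    print([round(max(q_values(s)), 3) for s in range(N_STATES)])

In an actual decentralized deployment, local_update(s, ...) would run on the device returned by owner(s); co-locating processing with storage in this way keeps each device's memory at roughly 1/N of the full table and limits communication to the next-state value lookup.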
Pages: 27-35
Number of pages: 9