Improving reliability in resource management through adaptive reinforcement learning for distributed systems

Cited by: 17
Authors
Hussin, Masnida [1]
Hamid, Nor Asilah Wati Abdul [1]
Kasmiran, Khairul Azhar [2]
Affiliations
[1] Univ Putra Malaysia, Fac Comp Sci & IT, Dept Commun Technol & Networks, Serdang 43400, Selangor, Malaysia
[2] Univ Putra Malaysia, Fac Comp Sci & IT, Dept Comp Sci, Serdang 43400, Selangor, Malaysia
Keywords
Distributed systems; Resource management; Adaptive reinforcement learning; System reliability; Computational complexity
DOI
10.1016/j.jpdc.2014.10.001
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Demand for capacity in distributed systems (e.g., Grid and Cloud) plays a crucial role in today's information era due to the growing scale of these systems. While distributed systems provide a vast amount of computing power, their reliability is often hard to guarantee. This paper presents effective resource management using adaptive reinforcement learning (RL) that focuses on improving successful execution with low computational complexity. The approach uses an emerging RL methodology in conjunction with a neural network to help a scheduler effectively observe and adapt to dynamic changes in execution environments. Observation of the environment at various learning stages, normalized by resource-aware availability and feedback-based scheduling, brings the environment significantly closer to the optimal solutions. Our approach also addresses the high computational complexity of RL systems through on-demand information sharing. Results from our extensive simulations demonstrate the effectiveness of adaptive RL for improving system reliability. (C) 2014 Elsevier Inc. All rights reserved.
Pages: 93-100
Page count: 8