Reliability-aware failure recovery for cloud computing based automatic train supervision systems in urban rail transit using deep reinforcement learning

Cited by: 3
Authors
Zhu, Li [1]
Zhuang, Qingheng [1]
Jiang, Hailin [1]
Liang, Hao [1]
Gao, Xinjun [2]
Wang, Wei [3]
Affiliations
[1] Beijing Jiaotong Univ, State Key Lab Rail Traff Control & Safety, Beijing, Peoples R China
[2] China Acad Railway Sci Grp Co Ltd, Signal & Commun Res Inst, Beijing 100081, Peoples R China
[3] Traff Control Technol Co Ltd, Beijing, Peoples R China
Keywords
ATS; Cloud computing; Urban rail transit; Reliability; Failure recovery
DOI
10.1186/s13677-023-00502-x
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
As urban rail transit construction becomes increasingly integrated with information technology, modernization, informatization, and intelligence have become its direction of development. A growing number of cloud platforms are being developed for urban rail transit. However, the increasing scale of these platforms, coupled with the deployment of urban rail safety applications on them, presents a major challenge to cloud reliability. One of the key components of urban rail transit cloud platforms is Automatic Train Supervision (ATS). A failure of the ATS cloud service would reduce train punctuality and traffic efficiency, so it is essential to research cloud-based fault tolerance methods that improve the reliability of ATS cloud services. This paper proposes a proactive, reliability-aware failure recovery method for ATS cloud services based on reinforcement learning. We formulate the problem of penalty error decision and resource-efficient optimization using the advantage actor-critic (A2C) algorithm. To maintain the freshness of the information, we use Age of Information (AoI) to train the agent, and construct the agent using Long Short-Term Memory (LSTM) to improve its sensitivity to fault events. Simulation results demonstrate that our proposed approach, LSTM-A2C, can effectively identify and correct faults in ATS cloud services, improving service reliability.
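The abstract names two ingredients without giving formulas: Age of Information as a freshness measure and an actor-critic advantage signal. The sketch below illustrates the textbook definitions of both, not the paper's implementation. The names `AoITracker` and `advantages` are hypothetical, the reward and value numbers are made up for illustration, and the LSTM policy network is omitted entirely.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AoITracker:
    """Tracks the Age of Information of one monitored status stream.

    Hypothetical helper: AoI is taken here as the current time minus the
    generation time of the freshest update received so far.
    """
    last_generated: float = 0.0

    def receive(self, generated_at: float) -> None:
        # Keep only the freshest update; stale or out-of-order ones are ignored.
        self.last_generated = max(self.last_generated, generated_at)

    def age(self, now: float) -> float:
        return now - self.last_generated


def advantages(rewards: List[float], values: List[float],
               bootstrap_value: float, gamma: float = 0.99) -> List[float]:
    """Discounted return minus the critic's value estimate, per step.

    This is the standard actor-critic advantage; how the paper shapes its
    reward (penalties, resource cost) is not stated in the abstract.
    """
    out = []
    ret = bootstrap_value
    # Walk the trajectory backwards so each return reuses the next one.
    for r, v in zip(reversed(rewards), reversed(values)):
        ret = r + gamma * ret
        out.append(ret - v)
    out.reverse()
    return out


if __name__ == "__main__":
    tracker = AoITracker()
    tracker.receive(generated_at=2.0)
    print(tracker.age(now=5.0))
    print(advantages([1.0, 1.0], [0.0, 0.0], bootstrap_value=0.0, gamma=0.5))
```

In a setup like the one the abstract describes, the age of each status stream could be fed to the agent as part of its observation, and the advantage would weight the policy-gradient update; both choices are assumptions here.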
Pages: 14
Related papers
37 in total
  • [1] Reliability-aware failure recovery for cloud computing based automatic train supervision systems in urban rail transit using deep reinforcement learning
    Li Zhu
    Qingheng Zhuang
    Hailin Jiang
    Hao Liang
    Xinjun Gao
    Wei Wang
    Journal of Cloud Computing, 12
  • [2] A Deep Reinforcement Learning based Resource Allocation Method for Urban Rail Transit Cloud Systems
    Li, Ziheng
    Zhu, Li
    Li, Yang
    Liang, Hao
    Wang, Hao
    2021 IEEE INTELLIGENT TRANSPORTATION SYSTEMS CONFERENCE (ITSC), 2021 : 3922 - 3926
  • [3] Deep Reinforcement Learning based Reliability-aware Resource Placement and Task Offloading in Edge Computing
    Liang, Jingyu
    Feng, Zihan
    Gao, Han
    Chen, Ying
    Huang, Jiwei
    Truong, Hong-Linh
    2024 IEEE INTERNATIONAL CONFERENCE ON WEB SERVICES, ICWS 2024, 2024 : 697 - 706
  • [4] Reliability-Aware: Task Scheduling in Cloud Computing Using Multi-Agent Reinforcement Learning Algorithm and Neural Fitted Q
    Balla, Husamelddin
    Sheng, Chen
    Jing, Weipeng
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2021, 18 (01) : 36 - 47
  • [5] A Framework for Automatic Failure Recovery in ICT Systems by Deep Reinforcement Learning
    Ikeuchi, Hiroki
    Ge, Jiawen
    Matsuo, Yoichi
    Watanabe, Keishiro
    2020 IEEE 40TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS), 2020 : 1310 - 1315
  • [6] Multi-step look ahead deep reinforcement learning approach for automatic train regulation of urban rail transit lines with energy-saving
    Zhang, Yunfeng
    Li, Shukai
    Yuan, Yin
    Yang, Lixing
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2025, 145
  • [7] Deep-Reinforcement-Learning-Based Energy Management Strategy for Supercapacitor Energy Storage Systems in Urban Rail Transit
    Yang, Zhongping
    Zhu, Feiqin
    Lin, Fei
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2021, 22 (02) : 1150 - 1160
  • [8] Research on speed sensor fusion of urban rail transit train speed ranging based on deep learning
    Zhan, Xuemei
    Mu, Zhong Hua
    Kumar, Rajeev
    Shabaz, Mohammad
    NONLINEAR ENGINEERING - MODELING AND APPLICATION, 2021, 10 (01) : 363 - 373
  • [9] Collaborative Computing Optimization in Train-Edge-Cloud-Based Smart Train Systems Using Risk-Sensitive Reinforcement Learning
    Zhu, Li
    Lin, Sen
    Yu, F. Richard
    Li, Yang
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2024, 73 (03) : 3129 - 3141
  • [10] Cloud Computing Based Demand Response Management Using Deep Reinforcement Learning
    Song, Chunhe
    Han, Guangjie
    Zeng, Peng
    IEEE TRANSACTIONS ON CLOUD COMPUTING, 2022, 10 (01) : 72 - 81