A model-based reinforcement learning approach for maintenance optimization of degrading systems in a large state space

Cited: 24
Authors
Zhang, Ping [1 ,2 ]
Zhu, Xiaoyan [1 ]
Xie, Min [2 ,3 ]
Affiliations
[1] Univ Chinese Acad Sci, Sch Econ & Management, Bldg 7,80 Zhongguancun East Rd, Beijing, Peoples R China
[2] City Univ Hong Kong, Dept Syst Engn & Engn Management, Hong Kong, Peoples R China
[3] City Univ Hong Kong, Sch Data Sci, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Maintenance optimization; Periodic inspection; Model-based reinforcement learning; Degrading system; PREDICTIVE MAINTENANCE; DEGRADATION; RELIABILITY; POLICY; ANALYTICS; SUBJECT; PARTS;
DOI
10.1016/j.cie.2021.107622
Chinese Library Classification (CLC)
TP39 [Computer Applications];
Subject Classification Codes
081203; 0835;
Abstract
Maintenance scheduling for deteriorating systems has typically relied on explicit degradation models. In practice, however, the formula of the degradation process is often unknown and difficult to determine for an operating system. In this study, we develop a model-based reinforcement learning approach for maintenance optimization. The approach determines a maintenance action for each degradation state at each inspection time over a finite planning horizon, whether or not the degradation formula is known. At each inspection time, it learns an optimal assessment value for each maintenance action performed in each degradation state; this value quantifies the goodness of the state-action pair in terms of minimizing the accumulated maintenance cost over the planning horizon. When a well-defined degradation formula is known, we customize a Q-learning method with model-based acceleration to optimize the assessment values. When the degradation formula is unknown or hard to determine, we develop a Dyna-Q method with maintenance-oriented improvements: an environment model capturing the degradation pattern under different maintenance actions is learned first, and the assessment values are then optimized while accounting for the stochastic behavior of the system degradation. The final maintenance policy performs, in each state, the action with the highest assessment value. Experimental studies illustrate the applications.
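The Dyna-Q idea summarized in the abstract can be sketched as follows. This is a minimal illustration, not the authors' method: the degradation dynamics (`true_step`), the state/action sets, the cost values, and all hyperparameters are hypothetical stand-ins, and the sketch learns a stationary policy rather than the paper's time-indexed policy over a finite planning horizon. It shows the two Dyna-Q ingredients the abstract names: a direct Q-learning update from real transitions, and extra planning updates from a learned environment model.

```python
import random
from collections import defaultdict

# Hypothetical setup: discretized degradation levels 0..N_STATES-1,
# where the last state represents failure.
N_STATES = 10
ACTIONS = ["do_nothing", "repair", "replace"]
COSTS = {"do_nothing": 0.0, "repair": 5.0, "replace": 20.0}
FAILURE_COST = 100.0
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1
PLANNING_STEPS = 20  # simulated model-based updates per real transition

def true_step(state, action):
    """Stand-in stochastic degradation environment (assumed, not from the paper)."""
    if action == "replace":
        nxt = 0                                   # as good as new
    elif action == "repair":
        nxt = max(0, state - random.randint(1, 3))  # imperfect repair
    else:
        nxt = min(N_STATES - 1, state + random.choice([0, 1, 1, 2]))
    cost = COSTS[action] + (FAILURE_COST if nxt == N_STATES - 1 else 0.0)
    return nxt, -cost  # reward = negative maintenance cost

Q = defaultdict(float)   # assessment value for each (state, action) pair
model = {}               # learned environment model: (s, a) -> (s', r)

def greedy(state):
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def dyna_q_episode(horizon=50):
    s = 0
    for _ in range(horizon):
        a = random.choice(ACTIONS) if random.random() < EPS else greedy(s)
        s2, r = true_step(s, a)
        # Direct RL update from the real transition.
        Q[(s, a)] += ALPHA * (r + GAMMA * Q[(s2, greedy(s2))] - Q[(s, a)])
        # Learn the model, then plan with simulated experience drawn from it.
        model[(s, a)] = (s2, r)
        for _ in range(PLANNING_STEPS):
            (ps, pa), (ps2, pr) = random.choice(list(model.items()))
            Q[(ps, pa)] += ALPHA * (pr + GAMMA * Q[(ps2, greedy(ps2))] - Q[(ps, pa)])
        s = s2

random.seed(0)
for _ in range(200):
    dyna_q_episode()

# Final policy: in each state, perform the action with the highest assessment value.
policy = {s: greedy(s) for s in range(N_STATES)}
```

The planning loop is what distinguishes Dyna-Q from plain Q-learning: each real inspection yields one direct update plus many cheap simulated updates, which is what makes the approach attractive when real degradation data is expensive to collect.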
Pages: 14
Related Papers
50 records
  • [1] Distributionally Robust Model-based Reinforcement Learning with Large State Spaces
    Ramesh, Shyam Sundhar
    Sessa, Pier Giuseppe
    Hu, Yifan
    Krause, Andreas
    Bogunovic, Ilija
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [2] Model-based reinforcement learning approach for federated learning resource allocation and parameter optimization
    Karami, Farzan
    Khalaj, Babak Hossein
    COMPUTER COMMUNICATIONS, 2024, 228
  • [3] A Contraction Approach to Model-based Reinforcement Learning
    Fan, Ting-Han
    Ramadge, Peter J.
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130 : 325 - +
  • [4] On the Importance of Hyperparameter Optimization for Model-based Reinforcement Learning
    Zhang, Baohe
    Rajan, Raghu
    Pineda, Luis
    Lambert, Nathan
    Biedenkapp, Andre
    Chua, Kurtland
    Hutter, Frank
    Calandra, Roberto
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [5] An Efficient Approach to Model-Based Hierarchical Reinforcement Learning
    Li, Zhuoru
    Narayan, Akshay
    Leong, Tze-Yun
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 3583 - 3589
  • [6] A Model-based Factored Bayesian Reinforcement Learning Approach
    Wu, Bo
    Feng, Yanpeng
    Zheng, Hongyan
    APPLIED SCIENCE, MATERIALS SCIENCE AND INFORMATION TECHNOLOGIES IN INDUSTRY, 2014, 513-517 : 1092 - 1095
  • [7] Model-based inverse reinforcement learning for deterministic systems
    Self, Ryan
    Abudia, Moad
    Mahmud, S. M. Nahid
    Kamalapurkar, Rushikesh
    AUTOMATICA, 2022, 140
  • [8] Model-Based Reinforcement Learning for Quantized Federated Learning Performance Optimization
    Yang, Nuocheng
    Wang, Sihua
    Chen, Mingzhe
    Brinton, Christopher G.
    Yin, Changchuan
    Saad, Walid
    Cui, Shuguang
    2022 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM 2022), 2022, : 5063 - 5068
  • [9] Efficient hyperparameter optimization through model-based reinforcement learning
    Wu, Jia
    Chen, SenPeng
    Liu, XiYuan
    NEUROCOMPUTING, 2020, 409 : 381 - 393
  • [10] Model-Based Reinforcement Learning Method for Microgrid Optimization Scheduling
    Yao, Jinke
    Xu, Jiachen
    Zhang, Ning
    Guan, Yajuan
    SUSTAINABILITY, 2023, 15 (12)