A model-based reinforcement learning approach for maintenance optimization of degrading systems in a large state space

Cited: 24
Authors
Zhang, Ping [1 ,2 ]
Zhu, Xiaoyan [1 ]
Xie, Min [2 ,3 ]
Affiliations
[1] Univ Chinese Acad Sci, Sch Econ & Management, Bldg 7,80 Zhongguancun East Rd, Beijing, Peoples R China
[2] City Univ Hong Kong, Dept Syst Engn & Engn Management, Hong Kong, Peoples R China
[3] City Univ Hong Kong, Sch Data Sci, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Maintenance optimization; Periodic inspection; Model-based reinforcement learning; Degrading system; PREDICTIVE MAINTENANCE; DEGRADATION; RELIABILITY; POLICY; ANALYTICS; SUBJECT; PARTS;
DOI
10.1016/j.cie.2021.107622
Chinese Library Classification (CLC)
TP39 [Computer applications];
Discipline codes
081203; 0835;
Abstract
Scheduling maintenance tasks based on the deterioration process has often relied on degradation models. In practice, however, the formulas governing the degradation process are usually unknown and hard to determine for an operating system. In this study, we develop a model-based reinforcement learning approach for maintenance optimization. The approach determines a maintenance action for each degradation state at each inspection time over a finite planning horizon, whether or not the degradation formula is known. At each inspection time, the approach attempts to learn an optimal assessment value for each maintenance action that can be performed in each degradation state; the assessment value quantifies the goodness of the state-action pair in terms of minimizing the accumulated maintenance cost over the planning horizon. When a well-defined degradation formula is known, we optimize the assessment values with a customized Q-learning method with model-based acceleration. When the degradation formula is unknown or hard to determine, we develop a Dyna-Q method with maintenance-oriented improvements: an environment model capturing the degradation pattern under different maintenance actions is learned first, and the assessment values are then optimized while accounting for the stochastic behavior of the system degradation. The final maintenance policy performs, in each state, the maintenance action with the highest assessment value. Experimental studies illustrate the applications.
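The Dyna-Q component described in the abstract can be sketched on a hypothetical toy maintenance problem. Everything below (the five-state degradation chain, the cost constants, the action set) is an illustrative assumption, not taken from the paper; the sketch only shows the generic Dyna-Q pattern of learning a transition model alongside the assessment values and replaying it for extra planning updates.

```python
import random

# Hypothetical toy setting (not from the paper): degradation states 0..4,
# state 4 = failed; actions: 0 = do nothing, 1 = replace (reset to state 0).
N_STATES = 5
ACTIONS = (0, 1)
REPLACE_COST, FAILURE_COST = 5.0, 20.0

def step(state, action):
    """One inspection period: replacement resets the system; otherwise it
    degrades by a random increment. Being in the failed state is costly."""
    if action == 1:
        return 0, -REPLACE_COST
    nxt = min(state + random.choice((0, 1, 1, 2)), N_STATES - 1)
    return nxt, (-FAILURE_COST if nxt == N_STATES - 1 else 0.0)

def dyna_q(episodes=300, horizon=20, planning=10, alpha=0.2, gamma=0.95, eps=0.1):
    Q = [[0.0, 0.0] for _ in range(N_STATES)]   # assessment values Q(s, a)
    model = {}                                  # learned model: (s, a) -> (s', r)
    for _ in range(episodes):
        s = 0
        for _ in range(horizon):
            # epsilon-greedy action selection
            if random.random() < eps:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: Q[s][x])
            s2, r = step(s, a)
            # direct RL update from the real experience
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            model[(s, a)] = (s2, r)             # remember the observed transition
            # planning: replay transitions from the learned environment model
            for _ in range(planning):
                ps, pa = random.choice(list(model))
                ps2, pr = model[(ps, pa)]
                Q[ps][pa] += alpha * (pr + gamma * max(Q[ps2]) - Q[ps][pa])
            s = s2
    return Q

random.seed(0)
Q = dyna_q()
# final policy: in each state, the action with the highest assessment value
policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES)]
```

Under these assumed costs the learned policy replaces the unit once it reaches the failed state, since a one-off replacement cost beats the recurring failure cost; the paper's actual method additionally handles finite-horizon, inspection-time-dependent values and a stochastic learned model rather than this last-observation table.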
Pages: 14