An empirical study of the naïve REINFORCE algorithm for predictive maintenance

Cited by: 0
Authors
Siraskar, Rajesh [1, 5]
Kumar, Satish [1, 2]
Patil, Shruti [1, 2]
Bongale, Arunkumar [1]
Kotecha, Ketan [1, 2, 4]
Kulkarni, Ambarish [3]
Affiliations
[1] Symbiosis Int Deemed Univ, Symbiosis Inst Technol, Pune Campus, Pune, India
[2] Symbiosis Int Deemed Univ, Symbiosis Ctr Appl Artificial Intelligence, Pune, India
[3] Swinburne Univ Technol, Hawthorn 3122, Australia
[4] RUDN Univ, People Friendship Univ Russia, Miklukho Maklaya Str 6, Moscow 117198, Russia
[5] Birlasoft Ltd, CTO Off, Pune 411057, India
Keywords
Reinforcement learning; Predictive maintenance; REINFORCE
DOI
10.1007/s42452-025-06613-1
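Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09
Abstract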
Reinforcement Learning (RL) is a biologically inspired, autonomous machine learning method. RL algorithms can help generate optimal predictive maintenance (PdM) policies for complex industrial systems. However, these algorithms are extremely sensitive to hyperparameter tuning and network architecture, and this is where automated RL frameworks (AutoRL) can offer a platform that encourages industrial practitioners to apply RL to their problems. AutoRL applied to PdM has yet to be studied. Aimed at practitioners unfamiliar with complex RL tuning, we undertake an empirical study to understand untuned RL algorithms for generating optimal tool replacement policies for milling machines. We compare a naïve implementation of REINFORCE against the policies of industry-grade implementations of three advanced algorithms: Deep Q-Network (DQN), Advantage Actor-Critic (A2C), and Proximal Policy Optimization (PPO). Our broad goal was to study model performance under four scenarios: (1) simulated tool-wear data, (2) actual tool-wear data (benchmark IEEE DataPort PHM Society datasets), (3) univariate state with added noise levels and a random chance of breakdown, and finally (4) complex multivariate state. Across 15 environment variants, REINFORCE models demonstrated higher tool replacement precision (0.687), recall (0.629) and F1 (0.609) than A2C (0.449/0.480/0.442), DQN (0.418/0.504/0.374) and PPO (0.472/0.316/0.345), while also showing lower variability. Comparing the best auto-selected models over ten training rounds produced unusually wide performance gaps, with REINFORCE precision, recall and F1 at 0.884, 0.884 and 0.873 against the best A2C (0.520/0.859/0.639), DQN (0.651/0.937/0.740), and PPO (0.558/0.643/0.580) models.
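As a reading aid for the precision, recall and F1 figures above, the following minimal sketch computes the three metrics for binary tool-replacement decisions. It assumes that "replace" is encoded as 1 and treated as the positive class; the function name and the example labels are hypothetical and are not taken from the paper, whose evaluation code is not shown in this record.

```python
def precision_recall_f1(y_true, y_pred):
    """Binary precision, recall and F1 with 1 ('replace tool') as the positive class.

    Assumption: labels are 0 (keep tool) or 1 (replace tool). This is an
    illustrative encoding, not necessarily the one used in the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correct replacements
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # premature replacements
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed replacements
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up example: 8 decision points, 4 of which truly call for replacement.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]
p, r, f1 = precision_recall_f1(y_true, y_pred)
print(p, r, f1)
```

Note that an F1 value averaged over several environment variants need not equal the harmonic mean of the averaged precision and recall, which may be why the reported F1 figures do not follow directly from the reported precision and recall.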
For REINFORCE, a basic hyperparameter sensitivity and interaction analysis is conducted to better understand the dynamics, and results are presented for the learning rate, the discount factor γ, and the network activation functions (ReLU and Tanh). Our study suggests that, in the untuned state, simpler algorithms like REINFORCE perform reasonably well. For AutoRL frameworks, this research encourages seeking new design approaches to automatically identify optimum algorithm-hyperparameter combinations.
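To illustrate where the discount factor γ enters a REINFORCE update, here is a minimal sketch of the textbook formulation: per-step discounted returns and the resulting policy-gradient loss. It is not the authors' implementation; the function names, the reward values and the log-probabilities are hypothetical, and details such as baselines or return normalization are not stated in this abstract and are therefore omitted.

```python
def discounted_returns(rewards, gamma):
    """Return G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ... for each step t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_loss(log_probs, returns):
    """Textbook REINFORCE loss: minus the sum of log pi(a_t | s_t) * G_t.

    Minimizing this with gradient descent performs gradient ascent on the
    expected return. No baseline is subtracted (an assumption of this sketch).
    """
    return -sum(lp * g for lp, g in zip(log_probs, returns))


# Made-up three-step episode.
rewards = [1.0, 0.0, 2.0]
returns = discounted_returns(rewards, gamma=0.5)
loss = reinforce_loss([-0.5, -1.0, -0.25], returns)
print(returns, loss)
```

A smaller γ shrinks the contribution of later rewards to earlier decisions, which is one reason the discount factor can interact with the learning rate in a sensitivity analysis.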
Pages: 37
Related papers
50 in total
  • [1] Engine gearbox fault diagnosis using empirical mode decomposition method and Naïve Bayes algorithm
    Kiran Vernekar
    Hemantha Kumar
    K V Gangadharan
    Sādhanā, 2017, 42 : 1143 - 1153
  • [2] Building an algorithm for predictive maintenance
    Abiad, Mohammad
    Ionescu, Sorin
    UPB Scientific Bulletin, Series D: Mechanical Engineering, 2020, 82 (04): : 337 - 348
  • [3] A naïve five-element string algorithm
    Cui, Yanhong
    Guo, Renkuan
    Guo, Danni
    Journal of Software, 2009, 4 (09) : 925 - 934
  • [4] Naïve Bayes Algorithm for Large Scale Text Classification
    Pirunthavi SIVAKUMAR
    Jayalath EKANAYAKE
    Instrumentation, 2021, 8 (04) : 55 - 62
  • [5] A naïve HMO study of the casimir effect
    Ramon Carbó-Dorca
    Journal of Mathematical Chemistry, 2022, 60 : 581 - 585
  • [6] Predictive Maintenance for a Ventilator Using LSTM Algorithm
    Ruhiyat, Yusuf Hamzah
    Sumaryo, Sony
    Susanto, Erwin
    2022 IEEE ASIA PACIFIC CONFERENCE ON WIRELESS AND MOBILE (APWIMOB), 2022, : 108 - 111
  • [7] A general prognostic tracking algorithm for predictive maintenance
    Swanson, DC
    2001 IEEE AEROSPACE CONFERENCE PROCEEDINGS, VOLS 1-7, 2001, : 2971 - 2977
  • [8] Adaboost algorithm in the frame of predictive maintenance tasks
    Vasilic, Predrag
    Vujnovic, Sanja
    Popovic, Nikola
    Marjanovic, Aleksandra
    Durovic, Zeljko
    2018 23RD INTERNATIONAL SCIENTIFIC-PROFESSIONAL CONFERENCE ON INFORMATION TECHNOLOGY (IT), 2018,
  • [9] Study on predictive maintenance strategy
    1600, Science and Engineering Research Support Society (09):
  • [10] Spam message classification based on the naïve Bayes classification algorithm
    Ning, Bin
    Junwei, Wu
    Feng, Hu
    IAENG International Journal of Computer Science, 2019, 46 (01)