An empirical study of the naïve REINFORCE algorithm for predictive maintenance

Cited by: 0
Authors
Siraskar, Rajesh [1, 5]
Kumar, Satish [1, 2]
Patil, Shruti [1, 2]
Bongale, Arunkumar [1]
Kotecha, Ketan [1, 2, 4]
Kulkarni, Ambarish [3]
Affiliations
[1] Symbiosis Int Deemed Univ, Symbiosis Inst Technol, Pune Campus, Pune, India
[2] Symbiosis Int Deemed Univ, Symbiosis Ctr Appl Artificial Intelligence, Pune, India
[3] Swinburne Univ Technol, Hawthorn 3122, Australia
[4] RUDN Univ, People Friendship Univ Russia, Miklukho Maklaya Str 6, Moscow 117198, Russia
[5] Birlasoft Ltd, CTO Off, Pune 411057, India
Keywords
Reinforcement learning; Predictive maintenance; REINFORCE
DOI
10.1007/s42452-025-06613-1
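Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract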
Reinforcement Learning (RL) is a biologically inspired, autonomous machine learning method. RL algorithms can help generate optimal predictive maintenance (PdM) policies for complex industrial systems. However, these algorithms are extremely sensitive to hyperparameter tuning and network architecture, and this is where automated RL frameworks (AutoRL) can offer a platform that encourages industrial practitioners to apply RL to their problems. AutoRL applied to PdM has yet to be studied. Aimed at practitioners unfamiliar with complex RL tuning, we undertake an empirical study to understand how untuned RL algorithms perform when generating optimal tool replacement policies for milling machines. We compare a naïve implementation of REINFORCE against the policies of industry-grade implementations of three advanced algorithms: Deep Q-Network (DQN), Advantage Actor-Critic (A2C), and Proximal Policy Optimization (PPO). Our broad goal was to study model performance under four scenarios: (1) simulated tool-wear data, (2) actual tool-wear data (benchmark IEEE DataPort PHM Society datasets), (3) univariate state with added noise levels and a random chance of breakdown, and finally (4) complex multivariate state. Across 15 environment variants, REINFORCE models demonstrated higher tool replacement precision (0.687), recall (0.629) and F1 (0.609) than A2C (0.449/0.480/0.442), DQN (0.418/0.504/0.374) and PPO (0.472/0.316/0.345), while also demonstrating lower variability. Comparing the best auto-selected models over ten training rounds produced unusually wide performance gaps, with REINFORCE precision, recall and F1 at 0.884, 0.884 and 0.873 against the best A2C (0.520/0.859/0.639), DQN (0.651/0.937/0.740), and PPO (0.558/0.643/0.580) models.
For REINFORCE, a basic hyperparameter sensitivity and interaction analysis is conducted to better understand the dynamics, and results are presented for the learning rate, the discount factor $\gamma$, and the network activation functions (ReLU and Tanh). Our study suggests that, in the untuned state, simpler algorithms like REINFORCE perform reasonably well. For AutoRL frameworks, this research encourages seeking new design approaches to automatically identify optimum algorithm-hyperparameter combinations.
Pages: 37
Related papers
50 in total
  • [21] Digital Predictive Maintenance: Case Study
    Benesova, Andrea
    Hirman, Martin
    Steiner, Frantisek
    Tupa, Jiri
    2024 INTERNATIONAL CONFERENCE ON DIAGNOSTICS IN ELECTRICAL ENGINEERING, DIAGNOSTIKA 2024, 2024, : 168 - 173
  • [22] An empirical study of distributed software maintenance
    Bianchi, A
    Caivano, D
    Lanubile, F
    Rago, F
    Visaggio, G
    INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE, PROCEEDINGS, 2002, : 103 - 109
  • [23] A Semantic Model in the Context of Maintenance: A Predictive Maintenance Case Study
    May, Gokan
    Cho, Sangje
    Majidirad, AmirHossein
    Kiritsis, Dimitris
    APPLIED SCIENCES-BASEL, 2022, 12 (12):
  • [24] Belief elicitation in the presence of naïve respondents: An experimental study
    Li Hao
    Daniel Houser
    Journal of Risk and Uncertainty, 2012, 44 : 161 - 180
  • [25] Applicability of Algorithm Evaluation Metrics for Predictive Maintenance in Production Systems
    Engbers, Hendrik
    Alla, Abderrahim Ait
    Kreutz, Markus
    Freitag, Michael
    2020 6TH IEEE CONGRESS ON INFORMATION SCIENCE AND TECHNOLOGY (IEEE CIST'20), 2020, : 413 - 418
  • [26] Selecting an appropriate supervised machine learning algorithm for predictive maintenance
    Abdelfettah Ouadah
    Leila Zemmouchi-Ghomari
    Nedjma Salhi
    The International Journal of Advanced Manufacturing Technology, 2022, 119 : 4277 - 4301
  • [27] A conceptual framework for machine learning algorithm selection for predictive maintenance
    Arena, Simone
    Florian, Eleonora
    Sgarbossa, Fabio
    Solvsberg, Endre
    Zennaro, Ilenia
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 133
  • [28] Selecting an appropriate supervised machine learning algorithm for predictive maintenance
    Ouadah, Abdelfettah
    Zemmouchi-Ghomari, Leila
    Salhi, Nedjma
    INTERNATIONAL JOURNAL OF ADVANCED MANUFACTURING TECHNOLOGY, 2022, 119 (7-8): : 4277 - 4301
  • [29] Predictive Maintenance Algorithm Based on Machine Learning for Industrial Asset
    Alfaro-Nango, Angel J.
    Escobar-Gomez, Elias N.
    Chandomi-Castellanos, Eduardo
    Velazquez-Trujillo, Sabino
    Hernandez-de-Leon, Hector R.
    Blanco-Gonzalez, Lidya M.
    2022 8TH INTERNATIONAL CONFERENCE ON CONTROL, DECISION AND INFORMATION TECHNOLOGIES (CODIT'22), 2022, : 1489 - 1494
  • [30] NBA-Palm: prediction of palmitoylation site implemented in Naïve Bayes algorithm
    Yu Xue
    Hu Chen
    Changjiang Jin
    Zhirong Sun
    Xuebiao Yao
    BMC Bioinformatics, 7