Optimizing Efficiency of Machine Learning Based Hard Disk Failure Prediction by Two-Layer Classification-Based Feature Selection

Authors
Wang, Han [1]
Zhuge, Qingfeng [1]
Sha, Edwin Hsing-Mean [1]
Xu, Rui [1]
Song, Yuhong [1]
Affiliations
[1] East China Normal Univ, Sch Comp Sci & Technol, Shanghai 200063, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 13
Keywords
ML; AI; disk failure prediction; timeliness; feature selection;
DOI
10.3390/app13137544
Chinese Library Classification number
O6 [Chemistry];
Subject classification code
0703;
Abstract
Predicting hard disk failure effectively and efficiently can prevent the high costs of data loss for data storage systems. Disk failure prediction based on machine learning and artificial intelligence has gained notable attention because of its good capabilities. Improving the accuracy and performance of disk failure prediction, however, is still a challenging problem. When a disk failure is about to occur, the time available for the prediction process, including building models and predicting, is limited. Faster training would promote the efficiency of model updates, and late predictions not only have no value but also waste resources. To improve both prediction quality and modeling timeliness, a two-layer classification-based feature selection scheme is proposed in this paper. An attribute filter that calculates attribute importance, derived from the idea of classification tree models, was designed to remove attributes insensitive to failure identification. Furthermore, an attribute classification method is proposed that determines the correlation between features based on the correlation coefficient. In the experiments, machine learning and artificial intelligence models were applied, including naive Bayes, random forest, support vector machine, gradient boosted decision tree, convolutional neural networks, and long short-term memory. The results showed that the proposed technique could improve the prediction accuracy of ML/AI-based hard disk failure prediction models. Specifically, random forest and long short-term memory combined with the proposed technique showed the best accuracy. Meanwhile, the proposed scheme could reduce training and prediction latency by 75% and 83%, respectively, in the best case compared with the baseline methods.
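The two layers described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so everything here is an assumption, including the function names (`gini`, `stump_importance`, `two_layer_select`), the use of a single-split Gini impurity decrease as the tree-style importance score, the greedy correlation grouping, and the default thresholds `min_importance=0.05` and `corr_threshold=0.9`. Layer 1 drops attributes whose importance is too low; layer 2 keeps only the most important attribute among those that are strongly correlated with each other.

```python
import numpy as np


def gini(y):
    """Gini impurity of a binary (0/1) label vector."""
    if y.size == 0:
        return 0.0
    p = y.mean()
    return 2.0 * p * (1.0 - p)


def stump_importance(x, y):
    """Largest Gini impurity decrease from a single threshold split on x."""
    parent = gini(y)
    best = 0.0
    # Skip the maximum value: splitting there would leave the right side empty.
    for t in np.unique(x)[:-1]:
        left, right = y[x <= t], y[x > t]
        children = (left.size * gini(left) + right.size * gini(right)) / y.size
        best = max(best, parent - children)
    return best


def two_layer_select(X, y, min_importance=0.05, corr_threshold=0.9):
    """Return sorted column indices kept after both layers (assumed scheme)."""
    n_features = X.shape[1]
    importance = np.array([stump_importance(X[:, j], y) for j in range(n_features)])

    # Layer 1: filter out attributes that barely help separate the classes.
    kept = [j for j in range(n_features) if importance[j] >= min_importance]

    # Layer 2: visit attributes from most to least important and skip any
    # attribute that is strongly correlated with one already selected.
    selected = []
    for j in sorted(kept, key=lambda k: -importance[k]):
        redundant = any(
            abs(np.corrcoef(X[:, j], X[:, s])[0, 1]) >= corr_threshold
            for s in selected
        )
        if not redundant:
            selected.append(j)
    return sorted(selected)
```

Calling `two_layer_select(X, y)` on a matrix of SMART-style attributes `X` (rows are samples, columns are attributes) and binary failure labels `y` returns the indices of the retained columns. Both thresholds are illustrative and would need tuning on real data.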
Pages: 18
Related papers
50 in total
  • [31] Two-layer Decision Model Based on Noise Classification
    Liu Tingting
    Kang Kai
    Chou Li
    2016 INTERNATIONAL CONFERENCE ON ROBOTS & INTELLIGENT SYSTEM (ICRIS), 2016, : 469 - 472
  • [32] A Two-Layer Architecture for Failure Prediction Based on High-Dimension Monitoring Sequences
    Wang, Xue
    Liu, Fan
    Feng, Yixin
    Zhao, Jiabao
    COMPLEXITY, 2021, 2021
  • [33] Development of a two-layer machine learning model for the forensic application of legal and illegal poppy classification based on sequence data
    An, Hyung-Eun
    Mun, Min-Ho
    Malik, Adeel
    Kim, Chang-Bae
    FORENSIC SCIENCE INTERNATIONAL-GENETICS, 2024, 71
  • [34] A two-layer switching based trajectory prediction method
    Reisinger, Stefan
    Adelberger, Daniel
    del Re, Luigi
    European Journal of Control, 2021, 62 : 143 - 150
  • [36] Classification-based machine learning approaches to predict the taste of molecules: A review
    Rojas, Cristian
    Ballabio, Davide
    Consonni, Viviana
    Suarez-Estrella, Diego
    Todeschini, Roberto
    FOOD RESEARCH INTERNATIONAL, 2023, 171
  • [37] Optimizing Endotracheal Suctioning Classification: Leveraging Prompt Engineering in Machine Learning for Feature Selection
    Islam, Mahera Roksana
    Ferdous, Anik Mahmud
    Hossain, Shahera
    Ahad, Md Atiqur Rahman
    Alnajjar, Fady
    2024 INTERNATIONAL CONFERENCE ON ACTIVITY AND BEHAVIOR COMPUTING, ABC 2024, 2024,
  • [38] Feature selection with Fast Correlation-Based Filter for Breast cancer prediction and Classification using Machine Learning Algorithms
    Khourdifi, Youness
    Bahaj, Mohamed
    2018 INTERNATIONAL SYMPOSIUM ON ADVANCED ELECTRICAL AND COMMUNICATION TECHNOLOGIES (ISAECT), 2018,
  • [39] Feature selection and fault-severity classification-based machine health assessment methodology for point machine sliding-chair degradation
    Atamuradov, Vepa
    Medjaher, Kamal
    Camci, Fatih
    Zerhouni, Noureddine
    Dersin, Pierre
    Lamoureux, Benjamin
    QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL, 2019, 35 (04) : 1081 - 1099
  • [40] Filter-Based Feature Selection and Machine-Learning Classification of Cancer Data
    Farsi, Mohammed
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2021, 28 (01): : 83 - 92