Cloud failure prediction based on traditional machine learning and deep learning

Cited by: 9
Authors
Asmawi, Tengku Nazmi Tengku [1]
Ismail, Azlan [1, 2]
Shen, Jun [3 ]
Affiliations
[1] Univ Teknol MARA UiTM, Fac Comp & Math Sci FSKM, Shah Alam 40450, Selangor, Malaysia
[2] Univ Teknol MARA UiTM, Kompleks Al Khawarizmi, Inst Big Data Analyt & Artificial Intelligence IB, Shah Alam 40450, Selangor, Malaysia
[3] Univ Wollongong, Fac Engn & Informat Sci, Sch Comp & Informat Technol, Wollongong, NSW 2522, Australia
Keywords
Cloud computing; Job and task failure; Failure prediction; Deep learning; Machine learning; ARCHITECTURE
DOI
10.1186/s13677-022-00327-0
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cloud failure is a critical issue because it can cost cloud service providers millions of dollars, in addition to the loss of productivity suffered by industrial users. Fault tolerance management is the key approach to addressing this issue, and failure prediction is one of the techniques for preventing a failure from occurring. One of the main challenges in failure prediction is producing a highly accurate predictive model. Although some failure prediction models have been proposed, a comprehensive evaluation of models based on different types of machine learning algorithms is still lacking. Therefore, in this paper, we propose a comprehensive comparison and evaluation of predictive models for job and task failure. These models are built and trained using five traditional machine learning algorithms and three variants of deep learning algorithms. We use a benchmark dataset, called Google Cloud Traces, for training and testing the models. We evaluated the performance of the models using multiple metrics, determined their important features, and measured their scalability. Our analysis resulted in the following findings. First, for job failure prediction, we found that Extreme Gradient Boosting produces the best model, with the disk space request and CPU request being the most important features that influence the prediction. Second, for task failure prediction, we found that Decision Tree and Random Forest produce the best models, with the priority of the task being the most important feature for both models. Our scalability analysis determined that the Logistic Regression model is the most scalable compared with the others.
Pages: 19
Related papers
50 in total
  • [21] Enhancing estuary salinity prediction: A Machine Learning and Deep Learning based approach
    Saccotelli, Leonardo
    Verri, Giorgia
    De Lorenzis, Alessandro
    Cherubini, Carla
    Caccioppoli, Rocco
    Coppini, Giovanni
    Maglietta, Rosalia
    APPLIED COMPUTING AND GEOSCIENCES, 2024, 23
  • [22] News-based Machine Learning and Deep Learning Methods for Stock Prediction
    Guo, Junjie
    Tuckfield, Bradford
    4TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE APPLICATIONS AND TECHNOLOGIES (AIAAT 2020), 2020, 1642
  • [23] QoS Prediction Model of Cloud Services Based on Deep Learning
    Huang, WenJun
    Zhang, PeiYun
    Chen, YuTong
    Zhou, MengChu
    Al-Turki, Yusuf
    Abusorrah, Abdullah
    IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2022, 9 (03) : 564 - 566
  • [24] Deep Learning-based fault prediction in cloud system
    Dinh Dai Vu
    Xuan Tuong Vu
    Kim, Younghan
    12TH INTERNATIONAL CONFERENCE ON ICT CONVERGENCE (ICTC 2021): BEYOND THE PANDEMIC ERA WITH ICT CONVERGENCE INNOVATION, 2021 : 1826 - 1829
  • [25] QoS Prediction Model of Cloud Services Based on Deep Learning
    WenJun Huang
    PeiYun Zhang
    YuTong Chen
    MengChu Zhou
    Yusuf Al-Turki
    Abdullah Abusorrah
    IEEE/CAA Journal of Automatica Sinica, 2022, 9 (03) : 564 - 566
  • [26] Analysis of Job Failure and Prediction Model for Cloud Computing Using Machine Learning
    Jassas, Mohammad S.
    Mahmoud, Qusay H.
    SENSORS, 2022, 22 (05)
  • [27] A Failure Prediction Model for Large Scale Cloud Applications using Deep Learning
    Jassas, Mohammad S.
    Mahmoud, Qusay H.
    2021 15TH ANNUAL IEEE INTERNATIONAL SYSTEMS CONFERENCE (SYSCON 2021), 2021
  • [28] Machine learning for total cloud cover prediction
    Ágnes Baran
    Sebastian Lerch
    Mehrez El Ayari
    Sándor Baran
    Neural Computing and Applications, 2021, 33 : 2605 - 2620
  • [29] Machine learning for total cloud cover prediction
    Baran, Agnes
    Lerch, Sebastian
    El Ayari, Mehrez
    Baran, Sandor
    NEURAL COMPUTING & APPLICATIONS, 2021, 33 (07) : 2605 - 2620
  • [30] Traditional Machine Learning based on Atmospheric Conditions for Prediction of Dengue Presence
    Lopez, Brenda Sofia Sanchez
    Nolberto, Daniela Candioti
    Gutierrez, Jose Antonio Taquia
    Lopez, Yvan Garcia
    COMPUTACION Y SISTEMAS, 2023, 27 (03) : 769 - 777