Cloud failure prediction based on traditional machine learning and deep learning

Cited by: 9
Authors
Asmawi, Tengku Nazmi Tengku [1]
Ismail, Azlan [1, 2]
Shen, Jun [3]
Affiliations
[1] Univ Teknol MARA UiTM, Fac Comp & Math Sci FSKM, Shah Alam 40450, Selangor, Malaysia
[2] Univ Teknol MARA UiTM, Kompleks Al Khawarizmi, Inst Big Data Analyt & Artificial Intelligence IB, Shah Alam 40450, Selangor, Malaysia
[3] Univ Wollongong, Fac Engn & Informat Sci, Sch Comp & Informat Technol, Wollongong, NSW 2522, Australia
Keywords
Cloud computing; Job and task failure; Failure prediction; Deep learning; Machine learning; Architecture
DOI
10.1186/s13677-022-00327-0
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cloud failure is a critical issue: it can cost cloud service providers millions of dollars, in addition to the productivity losses suffered by industrial users. Fault tolerance management is the key approach to addressing this issue, and failure prediction is one of the techniques used to prevent failures from occurring. One of the main challenges in failure prediction is producing a highly accurate predictive model. Although some failure prediction models have been proposed, a comprehensive evaluation of models based on different types of machine learning algorithms is still lacking. Therefore, in this paper, we present a comprehensive comparison and evaluation of predictive models for job and task failure. The models are built and trained using five traditional machine learning algorithms and three variants of deep learning algorithms. We use a benchmark dataset, Google Cloud Traces, for training and testing the models. We evaluate model performance using multiple metrics, determine the most important features, and measure scalability. Our analysis yields the following findings. First, for job failure prediction, Extreme Gradient Boosting produces the best model, with the disk space request and CPU request being the most important features influencing the prediction. Second, for task failure prediction, Decision Tree and Random Forest produce the best models, with task priority being the most important feature for both. Finally, our scalability analysis shows that the Logistic Regression model is the most scalable of the models compared.
Pages: 19
Related papers
  • [1] Cloud failure prediction based on traditional machine learning and deep learning
    Tengku Nazmi Tengku Asmawi
    Azlan Ismail
    Jun Shen
    Journal of Cloud Computing, 11
  • [2] Cloud Service Failure Prediction on Google's Borg Cluster Traces Using Traditional Machine Learning
    Tuns, Adrian-Ioan
    Spataru, Adrian
    2023 25TH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING, SYNASC 2023, 2023: 162-169
  • [3] Deep Learning Versus Traditional Machine Learning Methods for Aggregated Energy Demand Prediction
    Paterakis, Nikolaos G.
    Mocanu, Elena
    Gibescu, Madeleine
    Stappers, Bart
    van Alst, Walter
    2017 IEEE PES INNOVATIVE SMART GRID TECHNOLOGIES CONFERENCE EUROPE (ISGT-EUROPE), 2017
  • [4] A Survey on Hardware Failure Prediction of Servers Using Machine Learning and Deep Learning
    Georgoulopoulos, Nikolaos
    Hatzopoulos, Alkiviadis
    Karamitsios, Konstantinos
    Tabakis, Irene Maria
    Kotrotsios, Konstantinos
    Metsai, Alexandros I.
    2021 10TH INTERNATIONAL CONFERENCE ON MODERN CIRCUITS AND SYSTEMS TECHNOLOGIES (MOCAST), 2021
  • [5] Machine Learning Based Workload Prediction in Cloud Computing
    Gao, Jiechao
    Wang, Haoyu
    Shen, Haiying
    2020 29TH INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATIONS AND NETWORKS (ICCCN 2020), 2020
  • [6] Deep Learning based Parking Prediction on Cloud Platform
    Li, Jiachang
    Li, Jiming
    Zhang, Haitao
    2018 4TH INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING AND COMMUNICATIONS (BIGCOM 2018), 2018: 132-137
  • [7] Feature Extraction Based on Deep Learning for Some Traditional Machine Learning Methods
    Cayir, Aykut
    Yenidogan, Isil
    Dag, Hasan
    2018 3RD INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ENGINEERING (UBMK), 2018: 494-497
  • [8] Machine Learning and Deep Learning for Throughput Prediction
    Lee, Dongwon
    Lee, Joohyun
    12TH INTERNATIONAL CONFERENCE ON UBIQUITOUS AND FUTURE NETWORKS (ICUFN 2021), 2021: 452-454
  • [9] Machine Learning and Deep Learning-Based Students’ Grade Prediction
    Korchi A.
    Messaoudi F.
    Abatal A.
    Manzali Y.
    Operations Research Forum, 4 (4)
  • [10] Wind Power Prediction Based on Machine Learning and Deep Learning Models
    Tarek, Zahraa
    Shams, Mahmoud Y.
    Elshewey, Ahmed M.
    El-kenawy, El-Sayed M.
    Ibrahim, Abdelhameed
    Abdelhamid, Abdelaziz A.
    El-dosuky, Mohamed A.
    CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 74 (01): 715-732