Fault Tolerance of Cloud Infrastructure with Machine Learning

Cited by: 2
Authors
Kalaskar, Chetankumar [1]
Thangam, S. [1]
Affiliation
[1] Amrita Vishwavidyapeetam, Amrita Sch Comp, Dept Comp Sci & Engn, Bangalore 560035, Karnataka, India
Keywords
Cloud computing; Fault tolerance; Machine learning; Reliability of cloud
DOI
10.2478/cait-2023-0034
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Enhancing the fault tolerance of cloud systems and accurately forecasting cloud performance are pivotal concerns in cloud computing research. This work addresses both with machine learning models. Using the Google trace dataset, with 10,000 cloud environment records covering diverse metrics, we systematically applied machine learning algorithms, including linear regression, decision trees, and gradient boosting, to construct predictive models. These models outperformed baseline methods, with C5.0 and XGBoost showing exceptional accuracy, precision, and reliability in forecasting cloud behavior. Feature importance analysis identified the ten most influential factors affecting cloud system performance. The results support cloud optimization and reliability by enabling proactive monitoring, early detection of performance issues, and improved fault tolerance. Future research can further refine these predictive models to enhance cloud resource management and ultimately improve service delivery in cloud computing.
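The abstract reports accuracy and precision and a ranking of the ten most influential features. As a minimal sketch of how such quantities are commonly computed (not the authors' implementation), the Python below uses only the standard library. The function names `accuracy`, `precision`, and `top_k_features`, and the feature names in the usage note, are hypothetical and chosen for illustration; labels are assumed to be binary, with 1 marking a failure.

```python
from typing import Dict, List, Sequence, Tuple


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of records whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Of the records predicted as `positive`, the fraction that truly are."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must be of equal length")
    # True labels of every record the model flagged as positive.
    flagged = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not flagged:
        return 0.0  # convention assumed here: no positive predictions -> 0.0
    return sum(1 for t in flagged if t == positive) / len(flagged)


def top_k_features(importances: Dict[str, float], k: int = 10) -> List[Tuple[str, float]]:
    """Return the k features with the highest importance score, best first."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]
```

For example, with the illustrative labels `y_true = [1, 0, 1, 1, 0, 0, 1, 0]` and `y_pred = [1, 0, 0, 1, 0, 1, 1, 0]`, both `accuracy` and `precision` evaluate to 0.75. The importance scores passed to `top_k_features` would come from whichever trained model is used; the abstract does not say how they were derived.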
Pages: 26-50
Page count: 25