Predictive modelling of MapReduce job performance in cloud environments using machine learning techniques

Cited by: 0
Authors
Bergui, Mohammed [1]
Hourri, Soufiane [1, 2]
Najah, Said [1]
Nikolov, Nikola S. [3]
Affiliations
[1] Univ Sidi Mohammed Ben Abdellah, Fac Sci & Technol, Dept Comp Sci, Lab Intelligent Syst & Applicat, Fes, Morocco
[2] Univ Cadi Ayyad, Higher Sch Technol, Lab Proc Ind Signals & Comp Sci, Safi, Morocco
[3] Univ Limerick, Dept Comp Sci & Informat Syst, Limerick, Ireland
Keywords
Hadoop; MapReduce; Big data; Performance modelling; Runtime prediction; Machine learning;
DOI
10.1186/s40537-024-00964-z
Chinese Library Classification (CLC) number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Within the Hadoop ecosystem, MapReduce stands as a cornerstone for managing, processing, and mining large-scale datasets. Yet, the absence of efficient solutions for precise estimation of job execution times poses a persistent challenge, impacting task allocation and distribution within Hadoop clusters. In this study, we present a comprehensive machine learning approach for predicting the execution time of MapReduce jobs, encompassing data collection, preprocessing, feature engineering, and model evaluation. Leveraging a rich dataset derived from comprehensive Hadoop MapReduce job traces, we explore the intricate relationship between cluster parameters and job performance. Through a comparative analysis of machine learning models, including linear regression, decision tree, random forest, and gradient-boosted regression trees, we identify the random forest model as the most effective, demonstrating superior predictive accuracy and robustness. Our findings underscore the critical role of features such as data size and resource allocation in determining job performance. With this work, we aim to enhance resource management efficiency and enable more effective utilisation of cloud-based Hadoop clusters for large-scale data processing tasks.
Pages: 20
Related papers
50 in total
  • [21] Predictive models for charitable giving using machine learning techniques
    Farrokhvar, Leily
    Ansari, Azadeh
    Kamali, Behrooz
    PLOS ONE, 2018, 13 (10):
  • [22] Predictive Analysis Of Breast Cancer Using Machine Learning Techniques
    Agrawal, Rashmi
    INGENIERIA SOLIDARIA, 2019, 15 (29):
  • [23] Predictive Analysis of Cervical Cancer Using Machine Learning Techniques
    Kumawat, Gaurav
    Vishwakarma, Santosh Kumar
    Chakrabarti, Prasun
    SMART TRENDS IN COMPUTING AND COMMUNICATIONS, VOL 1, SMARTCOM 2024, 2024, 945 : 501 - 516
  • [24] Predictive models for diabetes mellitus using machine learning techniques
    Hang Lai
    Huaxiong Huang
    Karim Keshavjee
    Aziz Guergachi
    Xin Gao
    BMC Endocrine Disorders, 19
  • [25] Comparative Analysis of Machine Learning Techniques Using Predictive Modeling
    Khandelwal, Ritu
    Goyal, Hemlata
    Shekhawat, Rajveer S.
    Recent Advances in Computer Science and Communications, 2022, 15 (03) : 466 - 477
  • [26] Evaluating machine learning prediction techniques and their impact on proactive resource provisioning for cloud environments
    Kirchoff, Dionatra F.
    Meyer, Vinicius
    Calheiros, Rodrigo N.
    De Rose, Cesar A. F.
    JOURNAL OF SUPERCOMPUTING, 2024, 80 (15) : 21920 - 21951
  • [27] Machine learning for predictive electrical performance using OCD
    Das, Sayantan
    Hung, Joey
    Halder, Sandip
    Koret, Roy
    Turovets, Igor
    Saib, Mohamed
    Charley, Anne-Laure
    Sendelbach, Matthew
    Ger, Avron
    Leray, Philippe
    METROLOGY, INSPECTION, AND PROCESS CONTROL FOR MICROLITHOGRAPHY XXXIII, 2019, 10959
  • [28] Using Big Data and Predictive Machine Learning in Aerospace Test Environments
    Armes, Tom
    Refern, Mark
    2013 IEEE AUTOTESTCON, 2013,
  • [29] Modelling Photovoltaic power output using Machine Learning techniques
    May, Siyasanga Innocent
    Bokoro, Pitshou
    Pratt, Lawrence
    Roro, Kittessa
    2022 IEEE PES/IAS PowerAfrica, PowerAfrica 2022, 2022,
  • [30] Modelling Photovoltaic power output using Machine Learning techniques
    May, Siyasanga Innocent
    Bokoro, Pitshou
    Pratt, Lawrence
    Roro, Kittessa
    2022 IEEE PES/IAS POWERAFRICA CONFERENCE, 2022, : 350 - 354