Performance Optimization of Machine Learning Algorithms Based on Spark

Authors
Luo W. [1]
Zhang S. [2]
Xu Y. [1]
Affiliations
[1] School of Information Management, Jiangxi University of Finance and Economics, Jiangxi, Nanchang
[2] College of Software Engineering, Guangxi Normal University, Guangxi, Guilin
Keywords
Machine learning algorithm; RDD; Shuffle; Spark
DOI
10.2478/amns-2024-0416
Abstract
This paper proposes a performance optimization strategy for Spark-based machine learning algorithms, targeting the Shuffle and in-memory data management modules. The Shuffle module is optimized by introducing an Observer monitoring module into the Spark cluster, which monitors task status and generates ShuffleWrite tasks dynamically. An adaptive caching mechanism for RDD data addresses the lack of in-memory data caching. In experiments, the optimized algorithm achieves a clustering accuracy of 89% and a response time 5% faster than the Random Forest algorithm. In road network traffic state discrimination, the optimized algorithm's classification F-measure reaches 99.53%, which is 5.32% higher than before optimization, and its running time is 767 seconds shorter than that of the unoptimized algorithm on about 6,880,000 records, a significant improvement in both efficiency and accuracy. © 2023 Weikang Luo, Shenglin Zhang and Yinggen Xu, published by Sciendo.
Related papers
50 in total
  • [31] MACHINE LEARNING BASED OPTIMIZATION APPROACH FOR BUILDING ENERGY PERFORMANCE
    Solmaz, Aslihan Senel
    2020 ASHRAE BUILDING PERFORMANCE ANALYSIS CONFERENCE AND SIMBUILD, 2020, : 69 - 76
  • [32] Optimization Design of Airfoil Hydrodynamic Performance Based on Machine Learning
    Li, Yangjian
    Li, Ziru
    Liu, Qian
    He, Wei
    Ship Building of China, 2024, 65 (01) : 176 - 189
  • [33] Slope Stability Prediction Method Based on Intelligent Optimization and Machine Learning Algorithms
    Yang, Yukun
    Zhou, Wei
    Jiskani, Izhar Mithal
    Lu, Xiang
    Wang, Zhiming
    Luan, Boyu
    SUSTAINABILITY, 2023, 15 (02)
  • [34] Optimization of Cooling Jacket Geometry Based on Numerical Modeling and Machine Learning Algorithms
    Malikov, Azamatjon Kakhramon ugli
    Lee, Jaeseung
    INTERNATIONAL JOURNAL OF AUTOMOTIVE TECHNOLOGY, 2025,
  • [35] Research progress on mixture optimization of concrete based on machine learning and metaheuristic algorithms
    Chen, Yun
    Fu, Qianwang
    Fuhe Cailiao Xuebao/Acta Materiae Compositae Sinica, 2024, 41 (11) : 5689 - 5716
  • [36] On Machine Learning-based Stage-aware Performance Prediction of Spark Applications
    Ye, Guangjun
    Liu, Wuji
    Wu, Chase Q.
    Shen, Wei
    Lyu, Xukang
    2020 IEEE 39TH INTERNATIONAL PERFORMANCE COMPUTING AND COMMUNICATIONS CONFERENCE (IPCCC), 2020,
  • [37] Performance of Machine Learning Algorithms and Diversity in Data
    Sug, Hyontai
    22ND INTERNATIONAL CONFERENCE ON CIRCUITS, SYSTEMS, COMMUNICATIONS AND COMPUTERS (CSCC 2018), 2018, 210
  • [38] Machine learning algorithms for monitoring pavement performance
    Cano-Ortiz, Saul
    Pascual-Munoz, Pablo
    Castro-Fresno, Daniel
    AUTOMATION IN CONSTRUCTION, 2022, 139
  • [39] Performance of Machine Learning Algorithms for IT Incident Management
    Prihandono, Mohammad Agus
    Harwahyu, Ruki
    Sari, Riri Fitri
    2020 11TH INTERNATIONAL CONFERENCE ON AWARENESS SCIENCE AND TECHNOLOGY (ICAST), 2020,
  • [40] Experimental Performance Analysis of Machine Learning Algorithms
    Khekare, Ganesh
    Turukmane, Anil V.
    Dhule, Chetan
    Sharma, Pooja
    Kumar Bramhane, Lokesh
    Lecture Notes in Electrical Engineering, 2022, 942 LNEE : 1041 - 1052