Performance Optimization of Machine Learning Algorithms Based on Spark

Authors
Luo W. [1]
Zhang S. [2]
Xu Y. [1]
Affiliations
[1] School of Information Management, Jiangxi University of Finance and Economics, Jiangxi, Nanchang
[2] College of Software Engineering, Guangxi Normal University, Guangxi, Guilin
Keywords
Machine learning algorithm; RDD; Shuffle; Spark
DOI
10.2478/amns-2024-0416
Abstract
This paper proposes a performance optimization strategy for Spark-based machine learning algorithms that targets the Shuffle and in-memory data management modules. The Shuffle module is optimized by introducing an Observer monitoring module into the Spark cluster, which monitors task status and generates ShuffleWrite tasks dynamically. An adaptive caching mechanism for RDD data addresses the lack of in-memory data caching. In the experiments, the optimized algorithm achieves a clustering accuracy of 89% and a response time 5% faster than that of the Random Forest algorithm. In road network traffic state discrimination, the optimized algorithm's classification F-measure reaches 99.53%, which is 5.32% higher than before optimization, and its running time on about 6,880,000 records is 767 seconds shorter than that of the unoptimized algorithm, a significant improvement in both efficiency and accuracy. © 2023 Weikang Luo, Shenglin Zhang and Yinggen Xu, published by Sciendo.
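The abstract reports an F-measure without stating how it was computed. As a minimal sketch, assuming the common balanced form (F1, the harmonic mean of precision and recall), the metric can be derived from confusion counts as follows. The function name `f_measure`, the `beta` parameter, and the sample counts are illustrative assumptions and are not taken from the paper.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Return the F-beta score from true-positive, false-positive and
    false-negative counts. beta=1.0 gives the balanced F1 score
    (an assumption; the abstract does not specify the weighting)."""
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Hypothetical counts for illustration only, not the paper's data.
    print(f"F1 = {f_measure(tp=90, fp=10, fn=10):.4f}")
```

Because the abstract does not say whether "5.32% higher" is an absolute or a relative difference, the sketch makes no attempt to reproduce the reported figures.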