KubePipe: a container-based high-level parallelization tool for scalable machine learning pipelines

Authors
Suarez, Daniel [1]
Almeida, Francisco [1]
Blanco, Vicente [1]
Toledo, Pedro [1]
Affiliation
[1] Univ La Laguna ULL, Comp Sci & Syst Dept, San Francisco de Paula s-n, San Cristobal la Laguna 38270, Spain
Source
JOURNAL OF SUPERCOMPUTING | 2025 / Vol. 81 / Issue 03
Keywords
Kubernetes; Machine learning; Containerization; Parallel computing; High-performance computing; Pipeline optimization
DOI
10.1007/s11227-025-06956-x
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
As the complexity and scale of machine learning applications continue to grow, the need for efficient training methodologies becomes increasingly critical. Traditional training processes can be time-intensive, often limiting rapid development and deployment. In response to this challenge, we present KubePipe, a high-level tool that abstracts parallelism and containerization from the user, allowing non-expert users to leverage advanced parallel architectures without requiring deep knowledge of parallel computing or container orchestration. KubePipe enables the concurrent execution of multiple machine learning workflows within a Kubernetes cluster, optimizing computational resources and significantly reducing training times. By leveraging containerized environments, KubePipe ensures a high degree of modularity, scalability, and portability, making it adaptable to various machine learning frameworks and tasks. Our experimental results demonstrate substantial performance improvements when using KubePipe compared to conventional pipeline implementations. This paper explores the architecture and functionality of KubePipe, providing insights into its integration with existing machine learning systems and highlighting its potential to streamline the training process in high-performance computing environments.
Pages: 35