Data-Driven Job Dispatching in HPC Systems

Cited by: 12
Authors
Galleguillos, Cristian [1, 2]
Sirbu, Alina [3]
Kiziltan, Zeynep [1]
Babaoglu, Ozalp [1]
Borghesi, Andrea [1]
Bridi, Thomas [1]
Affiliations
[1] Univ Bologna, Dept Comp Sci & Engn, Bologna, Italy
[2] Pontificia Univ Catolica Valparaiso, Escuela Ingn Informat, Valparaiso, Chile
[3] Univ Pisa, Dept Comp Sci, Pisa, Italy
DOI
10.1007/978-3-319-72926-8_37
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
As High Performance Computing (HPC) systems get closer to exascale performance, job dispatching strategies become critical for keeping system utilization high while keeping waiting times low for jobs competing for HPC system resources. In this paper, we take a data-driven approach and investigate whether better dispatching decisions can be made by transforming the log data produced by an HPC system into useful knowledge about its workload. In particular, we focus on job duration, develop a data-driven approach to job duration prediction, and analyze the effect of different prediction approaches in making dispatching decisions using a real workload dataset collected from Eurora, a hybrid HPC system. Experiments on various dispatching methods show promising results.
Pages: 449-461
Number of pages: 13