Prediction of job characteristics for intelligent resource allocation in HPC systems: a survey and future directions

Cited by: 0
Authors
Zhengxiong Hou
Hong Shen
Xingshe Zhou
Jianhua Gu
Yunlan Wang
Tianhai Zhao
Affiliations
[1] Northwestern Polytechnical University, Center for High Performance Computing, School of Computer Science
[2] Sun Yat-Sen University, School of Computer Science and Engineering
Source
Frontiers of Computer Science, 2022, 16 (05)
Keywords
high-performance computing; performance prediction; job characteristics; intelligent resource allocation; cloud computing; machine learning
DOI
Not available
Abstract
High-performance computing (HPC) clusters are increasingly popular, and large volumes of job logs recording many years of operation traces have accumulated. At the same time, the HPC cloud makes it possible to access HPC services remotely. To execute applications, both HPC end-users and cloud users must themselves request specific resources for different workloads. Because users are usually unfamiliar with the hardware details and software layers, as well as the performance behavior of the underlying HPC systems, it is hard for them to select optimal resource configurations in terms of performance, cost, and energy efficiency. Hence, how to provide on-demand services with intelligent resource allocation is a critical issue in the HPC community. Prediction of job characteristics plays a key role in intelligent resource allocation. This paper presents a survey of existing work on, and future directions for, the prediction of job characteristics for intelligent resource allocation in HPC systems. We first review existing techniques for obtaining performance and energy consumption data of jobs. We then survey techniques for single-objective predictions of runtime, queue time, power and energy consumption, cost, and optimal resource configuration for input jobs, as well as multi-objective predictions. We conclude after discussing future trends, research challenges, and possible solutions toward intelligent resource allocation in HPC systems.
Related papers
  • [1] Prediction of job characteristics for intelligent resource allocation in HPC systems: a survey and future directions
    Hou, Zhengxiong
    Shen, Hong
    Zhou, Xingshe
    Gu, Jianhua
    Wang, Yunlan
    Zhao, Tianhai
    [J]. FRONTIERS OF COMPUTER SCIENCE, 2022, 16 (05)
  • [2] Prediction of job characteristics for intelligent resource allocation in HPC systems: a survey and future directions
    Zhengxiong HOU
    Hong SHEN
    Xingshe ZHOU
    Jianhua GU
    Yunlan WANG
    Tianhai ZHAO
    [J]. Frontiers of Computer Science, 2022, 16 (05) - 37
  • [3] EVALIX: Classification and Prediction of Job Resource Consumption on HPC Platforms
    Emeras, Joseph
    Varrette, Sebastien
    Guzek, Mateusz
    Bouvry, Pascal
    [J]. JOB SCHEDULING STRATEGIES FOR PARALLEL PROCESSING, JSSPP 2016, 2017, 10353 : 102 - 122
  • [4] Heterogeneity-Aware Resource Allocation in HPC Systems
    Netti, Alessio
    Galleguillos, Cristian
    Kiziltan, Zeynep
    Sirbu, Alina
    Babaoglu, Ozalp
    [J]. HIGH PERFORMANCE COMPUTING, ISC HIGH PERFORMANCE 2018, 2018, 10876 : 3 - 21
  • [5] Reliability-Aware Resource Allocation in HPC Systems
    Gottumukkala, Narasimha Raju
    Leangsuksun, Chokchai Box
    Taerat, Narate
    Nassar, Raja
    Scott, Stephen L.
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING, 2007, : 312 - +
  • [6] Intelligent Prediction Method for Transport Resource Allocation
    Kong, Yan
    Pan, Shuzhen
    [J]. SENSORS AND MATERIALS, 2019, 31 (06) : 1917 - 1925
  • [7] Optimizing Communication and Cooling Costs in HPC Data Centers via Intelligent Job Allocation
    Kaplan, Fulya
    Meng, Jie
    Coskun, Ayse K.
    [J]. 2013 INTERNATIONAL GREEN COMPUTING CONFERENCE (IGCC), 2013,
  • [8] Intelligent Resource Allocation in Wireless Communications Systems
    Lee, Woongsup
    Jo, Ohyun
    Kim, Minhoe
    [J]. IEEE COMMUNICATIONS MAGAZINE, 2020, 58 (01) : 100 - 105
  • [9] A Survey on Requirements of Future Intelligent Networks: Solutions and Future Research Directions
    Husen, Arif
    Chaudary, Muhammad Hasanain
    Ahmad, Farooq
    [J]. ACM COMPUTING SURVEYS, 2023, 55 (04)
  • [10] Cost-Effectiveness and Resource Allocation (CERA) – directions for the future
    Rob Baltussen
    Arnab Acharya
    Kathryn Antioch
    Dan Chisholm
    Richard Grieve
    Joses Kirigia
    Tessa Tan Torres-Edejer
    Damian G Walker
    David Evans
    [J]. Cost Effectiveness and Resource Allocation, 7 (1)