Learning Based Performance and Power Efficient Cluster Resource Manager for CPU-GPU Cluster

Cited by: 1
Authors
Das, Soumen Kumar [1]
Sudhakaran, G. [1]
Ashok, V. [1]
Affiliations
[1] ISRO, Vikram Sarabhai Space Ctr, Govt India, Dept Space, Trivandrum, Kerala, India
Keywords
High performance Cluster; CRM; Moldable Scheduler; Collocation; Resource Manager; petascale; green computing;
DOI
10.1109/EAIT.2014.58
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The recent success in building petascale High Performance Computing (HPC) systems has created demand for efficient, optimized use of resources to increase performance and reduce power consumption. In addition, the heterogeneous architectures of today's HPC systems, which combine multicore CPUs with many-core accelerators such as GPUs, raise a further concern: achieving optimum utilization of each of these components. This paper presents the scheduling mechanism of the Cluster Resource Manager (CRM), which has two parts. (i) The Moldable job Scheduler (MS) molds jobs with respect to the number of machines, based on a preliminarily initialized and automatically updated heuristic knowledge base of problem size, optimum machine count, and execution duration, to increase utilization of the full cluster facility. (ii) The Collocation Aware and Power Efficient Resource Manager (CAPE-RM) manages collocation of CPU-only and GPU-accelerated jobs by monitoring CPU load and memory usage. Growing computational capability is accompanied by a huge amount of power consumption. Although the use of GPUs itself reduces the power needed compared with a CPU-only cluster, a green computing facility calls for greater power efficiency. CAPE-RM is designed to support this by powering off idle nodes, based on monitoring of the total load on the facility and on a simple statistic of the frequency of job submission.
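The abstract names three decisions: molding a job to a machine count from a knowledge base, collocating jobs based on CPU load and memory usage, and powering off idle nodes based on submission frequency. The following is a minimal Python sketch of how such decisions might look. It is not the authors' implementation: the nearest-problem-size lookup, the "keep the fastest run" update rule, the thresholds, and every name (`KnowledgeEntry`, `KnowledgeBase`, `can_collocate`, `nodes_to_power_off`) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class KnowledgeEntry:
    """One hypothetical knowledge-base record: size, machine count, runtime."""
    problem_size: int
    optimum_machines: int
    duration_s: float


class KnowledgeBase:
    """Illustrative stand-in for the heuristic knowledge base in the abstract."""

    def __init__(self, entries):
        self.entries = list(entries)

    def suggest_machines(self, problem_size, free_machines):
        # Assumption: use the record with the nearest problem size,
        # capped by the machines currently free (0 means the job must wait).
        nearest = min(self.entries,
                      key=lambda e: abs(e.problem_size - problem_size))
        return min(nearest.optimum_machines, free_machines)

    def record(self, problem_size, machines, duration_s):
        # Assumption: "auto update" keeps the fastest run seen per problem size.
        for i, e in enumerate(self.entries):
            if e.problem_size == problem_size:
                if duration_s < e.duration_s:
                    self.entries[i] = KnowledgeEntry(problem_size, machines,
                                                     duration_s)
                return
        self.entries.append(KnowledgeEntry(problem_size, machines, duration_s))


def can_collocate(cpu_load, mem_used_fraction, cpu_limit=0.5, mem_limit=0.7):
    # Assumed thresholds: a CPU-only job may share a node running a
    # GPU-accelerated job only while CPU load and memory usage stay below them.
    return cpu_load < cpu_limit and mem_used_fraction < mem_limit


def nodes_to_power_off(idle_nodes, submissions_per_hour):
    # Assumed policy: keep one idle node in reserve per expected hourly
    # submission and power off the remainder.
    reserve = int(submissions_per_hour)
    return max(0, idle_nodes - reserve)
```

For example, with records for problem sizes 1000 and 8000, a job of size 7000 would be matched to the 8000 record and capped by the number of free machines. The real CRM presumably uses richer heuristics than this nearest-match rule.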
Pages: 161-166
Number of pages: 6
Related papers
50 in total
  • [41] Performance Optimization for CPU-GPU Heterogeneous Parallel System
    Wang, Yanhua
    Qiao, Jianzhong
    Lin, Shukuan
    Zhao, Tinglei
    2016 12TH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (ICNC-FSKD), 2016, : 1259 - 1266
  • [42] Machine Learning Based Predictive Models in Mobile Platforms Using CPU-GPU
    Sohankar, Javad
    Pore, Madhurima
    Banerjee, Ayan
    Sadeghi, Koosha
    Gupta, Sandeep K. S.
    2020 7TH INTERNATIONAL CONFERENCE ON INTERNET OF THINGS: SYSTEMS, MANAGEMENT AND SECURITY (IOTSMS), 2020,
  • [43] GSched: An efficient scheduler for hybrid CPU-GPU HPC systems
    Mateos, Mariano Raboso
    Robles, Juan Antonio Cotobal
    1600, Springer Verlag (217): : 179 - 185
  • [44] Deep learning based data prefetching in CPU-GPU unified virtual memory
    Long, Xinjian
    Gong, Xiangyang
    Zhang, Bo
    Zhou, Huiyang
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2023, 174 : 19 - 31
  • [45] HETEROGENEOUS DESIGN AND EFFICIENT CPU-GPU IMPLEMENTATION OF COLLISION DETECTION
    Tayyub, Mohid
    Khan, Gul N.
    IADIS-INTERNATIONAL JOURNAL ON COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2019, 14 (02): : 25 - 40
  • [46] Power and Performance Optimal NoC Design for CPU-GPU Architecture Using Formal Models
    Alhubail, Lulwah
    Bagherzadeh, Nader
    2019 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE), 2019, : 634 - 637
  • [47] High Performance FFT Based Poisson Solver on a CPU-GPU Heterogeneous Platform
    Wu, Jing
    JaJa, Joseph
    IEEE 27TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS 2013), 2013, : 115 - 125
  • [48] Accelerating Batched Power Flow on Heterogeneous CPU-GPU Platform
    Hao, Jiao
    Zhang, Zongbao
    He, Zonglin
    Liu, Zhengyuan
    Tan, Zhengdong
    Song, Yankan
    Energies, 2024, 17 (24)
  • [49] A Runtime Workload Distribution with Resource Allocation for CPU-GPU Heterogeneous Systems
    Alsubaihi, Shouq
    Gaudiot, Jean-Luc
    2017 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2017, : 994 - 1003
  • [50] Research on LogGP Based Parallel Computing Model for CPU/GPU Cluster
    Wu, Yongwen
    Song, Junqiang
    Ren, Kaijun
    Li, Xiaoyong
    INFORMATION TECHNOLOGY AND INTELLIGENT TRANSPORTATION SYSTEMS, VOL 2, 2017, 455 : 409 - 420