A Simple Resource Usage Monitor for Users of PBS and Slurm

Author
Dawson, Douglas [1]
Affiliation
[1] Clemson Univ, Clemson, SC 29634 USA
Funding
U.S. National Science Foundation
Keywords
Job monitoring; Job efficiency; Slurm; PBS
DOI
10.1145/3626203.3670608
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The Research Computing and Data team at Clemson University has been working to improve resource utilization on the Palmetto Cluster. We developed a tool, jobperf, that gives users an easy way to view the resource usage of their jobs. The tool was originally developed for PBS; Slurm support was added as the Palmetto Cluster transitioned schedulers. Jobperf provides both a CLI and a web interface for viewing real-time, per-node CPU, GPU, and memory usage statistics. The tool was designed to be portable: it requires no cluster-wide components and can be installed without administrator intervention. In addition to Palmetto, we tested it on five Slurm clusters available through ACCESS-CI, and it worked well on three of these external systems. We have released the tool to the community as open source and hope it will be useful to other institutions.
Pages: 4