A Conceptual Framework for HPC Operational Data Analytics

Cited by: 6
Authors
Netti, Alessio [1]
Shin, Woong [2]
Ott, Michael [1]
Wilde, Torsten [3]
Bates, Natalie [4]
Affiliations
[1] Leibniz Supercomputing Centre, Garching, Germany
[2] Oak Ridge National Laboratory, Oak Ridge, TN, USA
[3] Hewlett Packard Enterprise, Houston, TX, USA
[4] Energy Efficient HPC Working Group, Houston, TX, USA
Funding
European Union Horizon 2020
Keywords
Exascale; Top500; HPC operations; Energy efficiency; Operational data; PERFORMANCE; PREDICTION; MANAGEMENT; EFFICIENCY;
DOI
10.1109/Cluster48925.2021.00086
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper provides a broad framework for understanding trends in Operational Data Analytics (ODA) for High-Performance Computing (HPC) facilities. The goal of ODA is to allow for the continuous monitoring, archiving, and analysis of near real-time performance data, providing immediately actionable information for multiple operational uses. In this work, we combine two models to provide a comprehensive HPC ODA framework: one is an evolutionary model of analytics capabilities that consists of four types, which are descriptive, diagnostic, predictive and prescriptive, while the other is a four-pillar model for energy-efficient HPC operations that covers facility, system hardware, system software, and applications. This new framework is then overlaid with a description of current development and production deployments of ODA within leading-edge HPC facilities. Finally, we perform a comprehensive survey of ODA works and classify them according to our framework, in order to demonstrate its effectiveness.
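The two models combined in the abstract can be read as a 4 x 4 grid: one axis for the analytics type and one for the operational pillar. The following is a minimal sketch of that reading, not the authors' implementation. The class names, the `classify` helper, and the three sample entries are hypothetical placeholders introduced only for illustration; they are not works surveyed in the paper.

```python
from dataclasses import dataclass
from enum import Enum


class AnalyticsType(Enum):
    """The four analytics types named in the abstract."""
    DESCRIPTIVE = "descriptive"
    DIAGNOSTIC = "diagnostic"
    PREDICTIVE = "predictive"
    PRESCRIPTIVE = "prescriptive"


class Pillar(Enum):
    """The four pillars of energy-efficient HPC operations named in the abstract."""
    FACILITY = "facility"
    SYSTEM_HARDWARE = "system hardware"
    SYSTEM_SOFTWARE = "system software"
    APPLICATIONS = "applications"


@dataclass(frozen=True)
class OdaWork:
    """Hypothetical record for one ODA work placed in the grid."""
    name: str
    analytics_type: AnalyticsType
    pillar: Pillar


def classify(works):
    """Group work names by (analytics type, pillar); empty cells stay visible."""
    grid = {(a, p): [] for a in AnalyticsType for p in Pillar}
    for work in works:
        grid[(work.analytics_type, work.pillar)].append(work.name)
    return grid


# Made-up sample entries (assumptions, not taken from the paper's survey).
sample_works = [
    OdaWork("power dashboard", AnalyticsType.DESCRIPTIVE, Pillar.FACILITY),
    OdaWork("cooling-demand forecast", AnalyticsType.PREDICTIVE, Pillar.FACILITY),
    OdaWork("job runtime advisor", AnalyticsType.PRESCRIPTIVE, Pillar.APPLICATIONS),
]

grid = classify(sample_works)
for (analytics_type, pillar), names in grid.items():
    if names:
        print(f"{analytics_type.value} / {pillar.value}: {', '.join(names)}")
```

Keeping all sixteen cells in the dictionary, including empty ones, makes gaps in coverage easy to spot, which is one plausible use of such a classification.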
Pages: 596-603
Page count: 8
Related papers
  • [1] The framework of parametric and nonparametric operational data analytics
    Feng, Qi
    Shanthikumar, J. George
    [J]. PRODUCTION AND OPERATIONS MANAGEMENT, 2023, 32 (09) : 2685 - 2703
  • [2] Autonomy Loops for Monitoring, Operational Data Analytics, Feedback, and Response in HPC Operations
    Boito, Francieli
    Brandt, Jim
    Cardellini, Valeria
    Carns, Philip
    Ciorba, Florina M.
    Egan, Hilary
    Eleliemy, Ahmed
    Gentile, Ann
    Gruber, Thomas
    Hanson, Jeff
    Haus, Utz-Uwe
    Huck, Kevin
    Ilsche, Thomas
    Jakobsche, Thomas
    Jones, Terry
    Karlsson, Sven
    Mueen, Abdullah
    Ott, Michael
    Patki, Tapasya
    Peng, Ivy
    Raghavan, Krishnan
    Simms, Stephen
    Shoga, Kathleen
    Showerman, Michael
    Tiwari, Devesh
    Wilde, Torsten
    Yamamoto, Keiji
    [J]. 2023 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING WORKSHOPS, CLUSTER WORKSHOPS, 2023, : 37 - 43
  • [3] A Conceptual Framework for Democratization of Big Data Analytics (BDA)
    Ismail, Nor Haslinda
    Alias, Rose Alinda
    [J]. ADVANCES ON INTELLIGENT INFORMATICS AND COMPUTING: HEALTH INFORMATICS, INTELLIGENT SYSTEMS, DATA SCIENCE AND SMART COMPUTING, 2022, 127 : 707 - 712
  • [4] Log Analytics in HPC: A Data-driven Reinforcement Learning Framework
    Luo, Zhengping
    Hou, Tao
    Nguyen, Tung Thanh
    Zeng, Hui
    Lu, Zhuo
    [J]. IEEE INFOCOM 2020 - IEEE CONFERENCE ON COMPUTER COMMUNICATIONS WORKSHOPS (INFOCOM WKSHPS), 2020, : 550 - 555
  • [5] Operational Data Analytics in practice: Experiences from design to deployment in production HPC environments
    Netti, Alessio
    Ott, Michael
    Guillen, Carla
    Tafani, Daniele
    Schulz, Martin
    [J]. PARALLEL COMPUTING, 2022, 113
  • [6] Business-driven data analytics: A conceptual modeling framework
    Nalchigar, Soroosh
    Yu, Eric
    [J]. DATA & KNOWLEDGE ENGINEERING, 2018, 117 : 359 - 372
  • [7] Conceptual Framework for Implementing Temporal Big Data Analytics in Companies
    Mach-Krol, Maria
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (23):
  • [8] FPGA Accelerated HPC and Data Analytics
    Strickland, Mike
    [J]. 2018 INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (FPT 2018), 2018, : 1 - 1
  • [9] Advancing manufacturing systems with big-data analytics: A conceptual framework
    Kozjek, Dominik
    Vrabic, Rok
    Rihtarsic, Borut
    Lavrac, Nada
    Butala, Peter
    [J]. INTERNATIONAL JOURNAL OF COMPUTER INTEGRATED MANUFACTURING, 2020, 33 (02) : 169 - 188
  • [10] Big data architecture for construction waste analytics (CWA): A conceptual framework
    Bilal, Muhammad
    Oyedele, Lukumon O.
    Akinade, Olugbenga O.
    Ajayi, Saheed O.
    Alaka, Hafiz A.
    Owolabi, Hakeem A.
    Qadir, Junaid
    Pasha, Maruf
    Bello, Sururah A.
    [J]. JOURNAL OF BUILDING ENGINEERING, 2016, 6 : 144 - 156