I/O-signature-based feature analysis and classification of high-performance computing applications

Cited by: 1
Authors
Park, Ju-Won [1]
Huang, Xin [2]
Lee, Jae-Kook [1]
Hong, Taeyoung [1]
Affiliations
[1] Korea Inst Sci & Technol Informat, 245 Daehak Ro, Daejeon 34141, South Korea
[2] Texas State Univ, Dept Comp Sci, San Marcos, TX 78666 USA
Keywords
I/O patterns analysis; Key features; High performance computing;
DOI
10.1007/s10586-023-04139-y
Chinese Library Classification (CLC) number
TP [Automation technology; computer technology];
Discipline classification code
0812;
Abstract
The demand for high-performance computing (HPC) resources in fields such as machine learning has increased significantly in recent years, and computing power has grown exponentially to keep up. However, these gains have not translated into comparable performance improvements in real-world applications. One of the main reasons is degraded input/output (I/O) performance, caused by increased storage latency and the complex parallel I/O architecture used to access data in distributed storage systems. In this study, we analyze application-specific I/O patterns to gain a deeper understanding of I/O throughput and of the interaction between applications and the I/O system. Specifically, we assess the importance of each feature of the I/O patterns through feature analysis of the collected monitoring information. We also investigate whether an application can be identified from the resulting key features. To this end, we report the classification accuracy and confusion matrices of four widely used classification algorithms, including random forest. The experimental results confirm that applications can be distinguished with an accuracy of more than 90% using application-specific I/O patterns.
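
The workflow the abstract describes (classify applications from I/O features, report accuracy and a confusion matrix, and rank feature importance) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the feature names, the application labels, and the synthetic data are hypothetical, and a simple nearest-centroid classifier with permutation importance stands in for the four algorithms (including random forest) evaluated in the paper, so that the example runs with the Python standard library alone.

```python
import random
from collections import defaultdict

# Hypothetical I/O features; the abstract does not list the real ones.
FEATURES = ["read_mb", "write_mb", "open_count", "meta_ops"]

# Hypothetical per-application feature means (synthetic, not from the paper).
APP_PROFILES = {
    "app_a": [800.0, 50.0, 20.0, 100.0],
    "app_b": [100.0, 900.0, 200.0, 300.0],
    "app_c": [400.0, 400.0, 1000.0, 2000.0],
}


def make_samples(n_per_app, rng):
    """Draw synthetic job records: each feature is Gaussian with 10% noise."""
    samples = []
    for app, means in APP_PROFILES.items():
        for _ in range(n_per_app):
            samples.append(([rng.gauss(m, 0.1 * m) for m in means], app))
    return samples


def fit_centroids(samples):
    """Stand-in classifier: store the mean feature vector of each application."""
    sums = defaultdict(lambda: [0.0] * len(FEATURES))
    counts = defaultdict(int)
    for x, y in samples:
        counts[y] += 1
        for i, value in enumerate(x):
            sums[y][i] += value
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, x):
    """Return the application whose centroid is closest in Euclidean distance."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )


def confusion_matrix(centroids, samples, labels):
    """Rows are true applications, columns are predicted applications."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for x, y in samples:
        matrix[index[y]][index[predict(centroids, x)]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def permutation_importance(centroids, samples, labels, feature_idx, rng):
    """Accuracy drop after shuffling one feature column across samples."""
    base = accuracy(confusion_matrix(centroids, samples, labels))
    column = [x[feature_idx] for x, _ in samples]
    rng.shuffle(column)
    shuffled = [
        (x[:feature_idx] + [v] + x[feature_idx + 1:], y)
        for (x, y), v in zip(samples, column)
    ]
    return base - accuracy(confusion_matrix(centroids, shuffled, labels))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
train = make_samples(50, rng)
test = make_samples(20, rng)
labels = sorted(APP_PROFILES)

centroids = fit_centroids(train)
matrix = confusion_matrix(centroids, test, labels)
acc = accuracy(matrix)
importance = {
    name: permutation_importance(centroids, test, labels, i, rng)
    for i, name in enumerate(FEATURES)
}

print("accuracy:", acc)
for label, row in zip(labels, matrix):
    print(label, row)
print("permutation importance:", importance)
```

The features here are left unscaled, so those with large magnitudes dominate the distance computation; a real pipeline would likely standardize them first, and a tree-based model such as random forest would not need that step.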
Pages: 3219-3231
Page count: 13