HDF5 in the exascale era: Delivering efficient and scalable parallel I/O for exascale applications

Cited by: 0

Authors
Breitenfeld, M. Scot [1]
Tang, Houjun [2]
Zheng, Huihuo [3]
Henderson, Jordan [1]
Byna, Suren [2,4]
Affiliations
[1] The HDF Group, Champaign, IL, United States
[2] Scientific Data Division, Lawrence Berkeley National Laboratory, Berkeley, CA, United States
[3] Argonne Leadership Computing Facility, Argonne National Laboratory, Lemont, IL, United States
[4] Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, United States
Keywords
Cache memory; Hierarchical systems; Parallel architectures; Storage efficiency; Storage management
DOI
10.1177/10943420241288244
Abstract
To model real-world systems accurately, scientific applications at exascale generate massive amounts of data and must manage data storage efficiently. However, parallel input and output (I/O) faces challenges due to new application workflows and the state-of-the-art memory, interconnect, and storage architectures considered in exascale designs. The storage hierarchy has expanded with node-local persistent memory, solid-state storage, and traditional disk and tape-based storage, thus requiring efficiency at each layer and much more efficient data movement among these layers. This paper discusses how the ExaHDF5 project improved the I/O performance and data management for exascale architectures by enhancing HDF5, a widely used parallel I/O library. The team developed an Asynchronous I/O Virtual Object Layer (VOL) connector that allowed overlapping I/O with computation. They also created a Cache VOL to complement asynchronous I/O by incorporating fast storage layers, such as burst buffer and node-local storage, into the parallel I/O workflow through caching and staging data. Additionally, the team enabled data aggregation and I/O at the node level by using a Subfiling Virtual File Driver (VFD). To demonstrate superior I/O performance with HDF5 at exascale, the ExaHDF5 team collaborated with several exascale applications. In this paper, we show I/O performance improvements for three applications: Cabana (a particle-based simulation library), EQSIM (a regional earthquake simulation software), and E3SM (a climate system modeling library). © The Author(s) 2024.
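The VOL connectors summarized in the abstract are designed to be enabled at runtime, without changing application code, by pointing HDF5 at the connector plugins through environment variables. A minimal sketch of that setup follows; the plugin path is a placeholder, and the exact connector strings depend on the installed vol-async and vol-cache versions, so consult those projects' documentation before relying on them:

```shell
# Load the asynchronous I/O VOL connector (vol-async) at runtime.
# /path/to/vol-async/lib is a placeholder for the actual install location.
export HDF5_PLUGIN_PATH=/path/to/vol-async/lib
export HDF5_VOL_CONNECTOR="async under_vol=0;under_info={}"

# Alternatively, stack the Cache VOL on top of the async VOL so writes are
# staged on node-local storage and flushed asynchronously; the connector
# string format here follows the vol-cache documentation and the config
# file name is illustrative:
# export HDF5_VOL_CONNECTOR="cache_ext config=cache.cfg;under_vol=512;under_info={under_vol=0;under_info={}}"
```

Because the connector is selected through the environment, the same HDF5 application binary can run with synchronous I/O, asynchronous I/O, or cached asynchronous I/O simply by changing these variables.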
Pages: 65-78
Related papers
17 records in total
  • [1] Accelerating HDF5 I/O for Exascale Using DAOS
    Soumagne, Jerome
    Henderson, Jordan
    Chaarawi, Mohamad
    Fortner, Neil
    Breitenfeld, Scot
    Lu, Songyu
    Robinson, Dana
    Pourmal, Elena
    Lombardi, Johann
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (04) : 903 - 914
  • [2] ExaHDF5: Delivering Efficient Parallel I/O on Exascale Computing Systems
    Byna, Suren
    Breitenfeld, M. Scot
    Dong, Bin
    Koziol, Quincey
    Pourmal, Elena
    Robinson, Dana
    Soumagne, Jerome
    Tang, Houjun
    Vishwanath, Venkatram
    Warren, Richard
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2020, 35 (01) : 145 - 160
  • [4] h5bench: A unified benchmark suite for evaluating HDF5 I/O performance on pre-exascale platforms
    Bez, Jean Luca
    Tang, Houjun
    Breitenfeld, Scot
    Zheng, Huihuo
    Liao, Wei-Keng
    Hou, Kaiyuan
    Huang, Zanhua
    Byna, Suren
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2024, 36 (16):
  • [5] HDF5 Cache VOL: Efficient and Scalable Parallel I/O through Caching Data on Node-local Storage
    Zheng, Huihuo
    Vishwanath, Venkatram
    Koziol, Quincey
    Tang, Houjun
    Ravi, John
    Mainzer, John
    Byna, Suren
    2022 22ND IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING (CCGRID 2022), 2022, : 61 - 70
  • [6] GPU Direct I/O with HDF5
    Ravi, John
    Byna, Suren
    Koziol, Quincey
    PROCEEDINGS OF 2020 IEEE/ACM FIFTH INTERNATIONAL PARALLEL DATA SYSTEMS WORKSHOP (PDSW 2020), 2020, : 28 - 33
  • [7] Design and optimisation of an efficient HDF5 I/O Kernel for massive parallel fluid flow simulations
    Ertl, Christoph
    Frisch, Jerome
    Mundani, Ralf-Peter
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2017, 29 (24):
  • [8] Modeling pre-Exascale AMR Parallel I/O Workloads via Proxy Applications
    Godoy, William F.
    Delozier, Jenna
    Watson, Gregory R.
    2022 IEEE 36TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW 2022), 2022, : 952 - 961
  • [9] Auto-Tuning of Parallel IO Parameters for HDF5 Applications
    Behzad, Babak
    Huchette, Joey
    Huong Luu
    Aydt, Ruth
    Koziol, Quincey
    Prabhat
    Byna, Suren
    Chaarawi, Mohamad
    Yao, Yushu
    2012 SC COMPANION: HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SCC), 2012, : 1429 - 1430
  • [10] Boosting I/O and visualization for exascale era using Hercule: test case on RAMSES
    Strafella, Loic
    Chapon, Damien
    14TH INTERNATIONAL CONFERENCE ON NUMERICAL MODELING OF SPACE PLASMA FLOWS (ASTRONUM-2019), 2020, 1623