Topology-Aware Data Aggregation for Intensive I/O on Large-Scale Supercomputers

Cited by: 0
Authors
Tessier, Francois [1]
Malakar, Preeti [1]
Vishwanath, Venkatram [1]
Jeannot, Emmanuel [2]
Isaila, Florin [3]
Affiliations
[1] Argonne Natl Lab, Argonne Leadership Comp Facil, Lemont, IL 60439 USA
[2] Inria Bordeaux Sud Ouest, Talence, France
[3] Univ Carlos III, Madrid, Spain
DOI
10.1109/COM-HPC.2016.13
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Reading and writing data efficiently from storage systems is critical for high-performance data-centric applications. These I/O systems are increasingly characterized by complex topologies and deeper memory hierarchies. Effective parallel I/O solutions are needed to scale applications on current and future supercomputers. Data aggregation is an efficient approach in which some processes are elected to aggregate data from a set of neighbors and write the aggregated data to storage. This optimizes bandwidth use and reduces contention. In this work, we take the network topology into account when mapping aggregators, and we propose an optimized buffering system to reduce the aggregation cost. We validate our approach using micro-benchmarks and the I/O kernel of a large-scale cosmology simulation. We show I/O operations up to 15x faster than with a standard implementation of MPI I/O.
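The aggregator election described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the cost function (bytes moved times hop count, on a mesh without wraparound), the function names `hops` and `elect_aggregator`, and the coordinates are all hypothetical, chosen only to show how topology can influence which process is elected.

```python
def hops(a, b):
    """Manhattan distance between two coordinates on a mesh (no wraparound)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def elect_aggregator(coords, sizes, io_node):
    """Return the rank that minimises a toy topology-aware aggregation cost.

    coords:  {rank: (x, y, ...)} position of each process (hypothetical layout)
    sizes:   {rank: number of bytes that rank contributes}
    io_node: coordinates of the storage gateway (hypothetical)
    """
    total = sum(sizes.values())

    def cost(candidate):
        # Cost of gathering every other rank's data onto the candidate.
        gather = sum(
            sizes[r] * hops(coords[r], coords[candidate])
            for r in coords
            if r != candidate
        )
        # Cost of flushing the aggregated buffer from the candidate to storage.
        flush = total * hops(coords[candidate], io_node)
        return gather + flush

    # Iterating over sorted ranks makes ties resolve to the lowest rank.
    return min(sorted(coords), key=cost)


if __name__ == "__main__":
    coords = {0: (0, 0), 1: (1, 0), 2: (2, 0), 3: (3, 0)}
    sizes = {0: 10, 1: 10, 2: 10, 3: 10}
    print(elect_aggregator(coords, sizes, io_node=(3, 1)))
```

In this toy model the elected rank shifts toward the storage gateway as the flush term grows, which is the kind of trade-off a topology-aware placement has to weigh against the cost of gathering data from neighbors.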
Pages: 73-81
Page count: 9