IKAROS: A scalable I/O framework for high-performance computing systems.

Cited by: 2
Authors
Filippidis, Christos [1]
Tsanakas, Panayiotis [2]
Cotronis, Yiannis [1]
Affiliations
[1] Univ Athens, Dept Informat & Telecommun, Athens, Greece
[2] Natl Tech Univ Athens, Sch Elect & Comp Engn, Athens, Greece
Keywords
Data management; Storage; Distributed systems; Parallel file systems; High performance computing; Exascale systems; GPFS; Soho-NAS; Grid computing
DOI
10.1016/j.jss.2016.05.027
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
High performance computing (HPC) has crossed the Petaflop mark and is quickly approaching the Exaflop range. Exascale systems are projected to have millions of nodes, with thousands of cores per node. At such an extreme scale, the substantial amount of concurrency can cause critical contention in the I/O system. This study proposes a dynamically coordinated I/O architecture that addresses some of the limitations that current parallel file systems and storage architectures face on very large-scale systems. The fundamental idea is to coordinate I/O accesses according to the topology/profile of the infrastructure, the load metrics, and the I/O demands of each application. The measurements show that, by using the IKAROS approach, we can fully utilize the provided I/O and network resources, minimize disk and network contention, and achieve better performance. © 2016 Elsevier Inc. All rights reserved.
Pages: 277-287
Page count: 11