IKAROS: A scalable I/O framework for high-performance computing systems.

Cited by: 2

Authors:

Filippidis, Christos [1]

Tsanakas, Panayiotis [2]

Cotronis, Yiannis [1]

Affiliations:

[1] Univ Athens, Dept Informat & Telecommun, Athens, Greece

[2] Natl Tech Univ Athens, Sch Elect & Comp Engn, Athens, Greece

Source:

Journal of Systems and Software | 2016 / Volume 118

Keywords:

Data management; Storage; Distributed systems; Parallel file systems; High performance computing; Exascale systems; GPFS; Soho-NAS; Grid computing;

DOI:

10.1016/j.jss.2016.05.027

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

High performance computing (HPC) has crossed the Petaflop mark and is quickly approaching the Exaflop range. An exascale system is projected to have millions of nodes, with thousands of cores per node. At such an extreme scale, the substantial amount of concurrency can cause critical contention in the I/O system. This study proposes a dynamically coordinated I/O architecture that addresses some of the limitations current parallel file systems and storage architectures face on very large-scale systems. The fundamental idea is to coordinate I/O accesses according to the topology/profile of the infrastructure, the load metrics, and the I/O demands of each application. The measurements show that the IKAROS approach can fully utilize the provided I/O and network resources, minimize disk and network contention, and achieve better performance. (C) 2016 Elsevier Inc. All rights reserved.


Pages: 277-287

Page count: 11
