I/O optimization in the checkpointing of OpenMP parallel applications

Cited by: 2
Authors
Losada, Nuria [1]
Martin, Maria J. [1]
Rodriguez, Gabriel [1]
Gonzalez, Patricia [1]
Affiliations
[1] Univ A Coruna, Comp Architecture Grp, La Coruna, Spain
Keywords
OpenMP; Fault Tolerance; Checkpointing; CPPC; Tool
DOI
10.1109/PDP.2015.39
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology]
Discipline classification code
0812
Abstract
Despite the increasing popularity of shared-memory systems, there is a lack of tools that provide fault tolerance support for shared-memory applications. Checkpointing is one of the most popular fault tolerance techniques. However, its cost in terms of computing time, network utilization, or storage resources can limit its practical use. This work proposes different techniques for optimizing the I/O cost of checkpointing shared-memory parallel applications. The proposals are extensively evaluated using the OpenMP NAS Parallel Benchmarks. Results show a significant decrease in checkpointing overhead.
Pages: 222-229
Page count: 8