Globally Precise-restartable Execution of Parallel Programs

Cited by: 0
Authors
Gupta, Gagan [1]
Sridharan, Srinath [1]
Sohi, Gurindar S. [1]
Affiliations
[1] Univ Wisconsin, Madison, WI 53706 USA
Keywords
Design; Experimentation; Measurement; Performance; Reliability; Deterministic Multithreading; Precise Exceptions; ROLLBACK-RECOVERY; COST; SAFE;
DOI
10.1145/2666356.2594306
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Subject classification code
081202; 0835
Abstract
Emerging trends in computer design and use are likely to make exceptions, once rare, the norm, especially as the system size grows. Due to exceptions, arising from hardware faults, approximate computing, dynamic resource management, etc., successful and error-free execution of programs may no longer be assured. Yet, designers will want to tolerate the exceptions so that the programs execute completely, efficiently and without external intervention. Modern computers easily handle exceptions in sequential programs, using precise interrupts. But they are ill-equipped to handle exceptions in parallel programs, which are growing in prevalence. In this work we introduce the notion of globally precise-restartable execution of parallel programs, analogous to precise-interruptible execution of sequential programs. We present a software runtime recovery system based on the approach to handle exceptions in suitably-written parallel programs. Qualitative and quantitative analyses show that the proposed system scales with the system size, especially when exceptions are frequent, unlike the conventional checkpoint-and-recovery method.
Pages: 181-192
Number of pages: 12