Fault-Tolerant Parallel Execution of Workflows with Deadlines

Cited by: 1
Authors
Eitschberger, Patrick [1 ]
Keller, Joerg [1 ]
Affiliations
[1] Fernuniv, Fac Math & Comp Sci, D-58084 Hagen, Germany
Source
2017 25TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND NETWORK-BASED PROCESSING (PDP 2017) | 2017
DOI
10.1109/PDP.2017.30
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Workflows of dependent tasks are a widespread model for parallel applications, often statically scheduled prior to application. Static schedules can tolerate processor failures due to permanent faults by placing duplicate tasks during the scheduling process. Schedules for workflows with deadlines can be extended to include frequency scaling information to optimize energy consumption. Frequency scaling can also be used in case of a fault to minimize its effect on the schedule makespan, albeit at the price of additional energy consumption. We investigate the interplay between these two parameters and quantify the energy increase to be expected for a fault and a given makespan increase. This knowledge enables the user to inform the scheduler about the makespan increase that is tolerable in case of a fault, where tolerable includes both the related performance aspects and the expected increase in energy. To achieve this, we model small task graphs from a benchmark suite as integer linear programs and use a solver to determine energy-optimal schedules for the fault-free case and for all possible fault positions at several levels of makespan increase. We present averages and distribution as a function of makespan increase for a processor with a hypothetical power profile. Additionally, we present two heuristics that modify task frequency settings in case of a fault to restrict the makespan increase to a given value. Comparison with optimal frequency settings from the benchmark suite indicates that the heuristics incur only a small energy overhead.
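The trade-off described in the abstract, between tolerated makespan increase and extra energy after a fault, can be illustrated with a minimal sketch. This is not the authors' integer linear program and not either of their heuristics. It assumes a hypothetical power model P(f) = f^3, a single block of remaining work, and one uniform frequency for that block. All function names, frequency levels, and numbers are illustrative assumptions.

```python
from typing import Optional, Sequence, Tuple


def task_energy(work: float, freq: float) -> float:
    """Energy of running `work` cycles at `freq`.

    Assumption: power is P(f) = f**3, so energy = runtime * power.
    """
    runtime = work / freq
    return runtime * freq ** 3


def lowest_feasible_frequency(
    remaining_work: float, time_budget: float, levels: Sequence[float]
) -> Optional[float]:
    """Return the lowest discrete frequency that fits the time budget, if any."""
    for freq in sorted(levels):
        if remaining_work / freq <= time_budget:
            return freq
    return None


def energy_after_fault(
    remaining_work: float,
    base_time: float,
    tolerated_increase: float,
    levels: Sequence[float],
) -> Optional[Tuple[float, float]]:
    """Pick a uniform frequency for the work left after a fault.

    `base_time` is the time the fault-free schedule had left;
    `tolerated_increase` is the allowed relative growth of that time (0.5 = 50%).
    Returns (frequency, energy), or None if no level meets the budget.
    """
    budget = base_time * (1.0 + tolerated_increase)
    freq = lowest_feasible_frequency(remaining_work, budget, levels)
    if freq is None:
        return None
    return freq, task_energy(remaining_work, freq)


if __name__ == "__main__":
    levels = [1.0, 1.5, 2.0]  # hypothetical normalized frequency levels
    for increase in (0.0, 0.5):
        print(increase, energy_after_fault(12.0, 8.0, increase, levels))
```

Under these assumptions, allowing no makespan increase forces the higher frequency 1.5 and energy 27.0, while tolerating a 50% increase lets the remaining work run at frequency 1.0 with energy 12.0.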
Pages: 78-84
Page count: 7