Model-driven scheduling for distributed stream processing systems

Cited by: 35
Authors
Shukla, Anshu [1]
Simmhan, Yogesh [1]
Affiliations
[1] Indian Inst Sci, Dept Computat & Data Sci, Bangalore, Karnataka, India
Keywords
Stream processing; Scheduling algorithms; Performance models; Big data; Cloud computing; Distributed systems; INTERNET; FUTURE
DOI
10.1016/j.jpdc.2018.02.003
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Distributed Stream Processing Systems (DSPS) are "Fast Data" platforms that allow streaming applications to be composed and executed with low latency on commodity clusters and Clouds. Such applications are composed as a Directed Acyclic Graph (DAG) of tasks, with data-parallel execution using concurrent task threads on distributed resource slots. Scheduling such DAGs for DSPS has two parts: allocation of threads and resources for a DAG, and mapping of threads to resources. Existing schedulers often address just one of these, assume that performance scales linearly, or use ad hoc empirical tuning at runtime. Instead, we propose model-driven techniques for both mapping and allocation that rely on low-overhead a priori performance modeling of tasks. Our scheduling algorithms offer predictable and low resource needs that are suitable for elastic pay-as-you-go Cloud resources, support a high input rate through high VM utilization, and can be combined with other mapping approaches as well. These are validated for micro and application benchmarks, and compared with contemporary schedulers, for the Apache Storm DSPS. © 2018 Elsevier Inc. All rights reserved.
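The distinction the abstract draws between linear-scaling assumptions and model-driven allocation can be illustrated with a toy sketch. This is not the authors' algorithm: the `PROFILE` numbers, the `allocate` function, and the greedy strategy are all assumptions invented for illustration. The sketch supposes that a task has been profiled beforehand to find the peak rate one resource slot sustains when running k threads, and that this rate grows sub-linearly with k.

```python
import math

# HYPOTHETICAL a priori profile for one task: peak sustainable input rate
# (tuples/s) of a single slot running k threads. The values are invented;
# note that 4 threads give far less than 4x the 1-thread rate.
PROFILE = {1: 100, 2: 180, 3: 240, 4: 250}


def allocate(input_rate, profile):
    """Return a list of per-slot thread counts that sustains input_rate.

    Greedy sketch (an assumption, not the paper's method): fill slots with
    the thread count that has the highest profiled rate, then cover the
    leftover rate with the smallest thread count that suffices.
    input_rate is expected to be a non-negative integer.
    """
    best_k = max(profile, key=profile.get)
    full_slots, remainder = divmod(input_rate, profile[best_k])
    plan = [best_k] * int(full_slots)
    if remainder > 0:
        plan.append(min(k for k, r in profile.items() if r >= remainder))
    return plan


plan = allocate(600, PROFILE)
print(plan, "threads:", sum(plan), "slots:", len(plan))

# A linear-scaling estimate extrapolates from the 1-thread rate alone.
linear_threads = math.ceil(600 / PROFILE[1])
print("linear estimate of threads:", linear_threads)
```

In this made-up profile, the linear estimate suggests fewer threads than the profiled rates can actually support at 600 tuples/s, which is the kind of under-provisioning a performance model is meant to avoid. Each entry of the returned plan also fixes how many threads share a slot, so allocation and mapping are decided together here; a real scheduler would treat mapping across a DAG of tasks separately.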
Pages: 98-114
Number of pages: 17