Tolerating Correlated Failures in Massively Parallel Stream Processing Engines

Cited by: 0
Authors
Su, Li [1]
Zhou, Yongluan [1]
Affiliation
[1] Univ Southern Denmark, Odense, Denmark
Source
2016 32ND IEEE INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE) | 2016
Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Fault-tolerance techniques for stream processing engines can be categorized into passive and active approaches. A typical passive approach periodically checkpoints a processing task's runtime state and can recover a failed task by restoring that state from its latest checkpoint. An active approach, on the other hand, usually employs backup nodes to run replicated tasks. Upon failure, the active replica can take over the processing of the failed task with minimal latency. However, both approaches have their own inadequacies in Massively Parallel Stream Processing Engines (MPSPE). The passive approach incurs a long recovery latency, especially when a number of correlated nodes fail simultaneously, while the active approach requires extra replication resources. In this paper, we propose a new fault-tolerance framework, which is Passive and Partially Active (PPA). In a PPA scheme, the passive approach is applied to all tasks, while only a selected set of tasks is actively replicated. The number of actively replicated tasks depends on the available resources. If tasks without active replicas fail, tentative outputs are generated before the recovery process completes. We also propose effective and efficient algorithms to optimize a partially active replication plan to maximize the quality of tentative outputs. We implemented PPA on top of Storm, an open-source MPSPE, and conducted extensive experiments using both real and synthetic datasets to verify the effectiveness of our approach.
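The PPA idea in the abstract (checkpoint every task, actively replicate only as many tasks as resources allow) can be illustrated with a minimal sketch. This is not the paper's optimization algorithm, which the abstract does not describe; it substitutes a simple greedy heuristic. The `Task` fields `impact` and `cost`, the function `plan_ppa`, and the sample topology are all hypothetical names and values introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    impact: float  # hypothetical score: loss in tentative-output quality if the task fails
    cost: int      # hypothetical resource units needed for an active replica (assumed > 0)


def plan_ppa(tasks, budget):
    """Return a plan: all tasks are checkpointed (passive), and a subset
    chosen greedily by impact per unit cost is actively replicated."""
    active = []
    remaining = budget
    # Greedy heuristic (an assumption, not the paper's method):
    # consider tasks in decreasing order of impact per unit cost.
    for task in sorted(tasks, key=lambda t: t.impact / t.cost, reverse=True):
        if task.cost <= remaining:
            active.append(task.name)
            remaining -= task.cost
    return {"passive": [t.name for t in tasks], "active": active}


# Hypothetical topology with made-up scores and costs.
tasks = [
    Task("source", impact=0.9, cost=2),
    Task("join", impact=0.6, cost=4),
    Task("filter", impact=0.2, cost=1),
    Task("sink", impact=0.5, cost=1),
]
plan = plan_ppa(tasks, budget=4)
print(plan)
```

With a budget of 4 units, the sketch replicates the three cheapest high-impact tasks and leaves `join` protected by checkpoints only, so a failure of `join` would be the case where tentative outputs are produced during recovery.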
Pages: 517-528
Page count: 12