Fault-Tolerant and Elastic Streaming MapReduce with Decentralized Coordination

Cited by: 16
Authors
Kumbhare, Alok [1]
Frincu, Marc [1]
Simmhan, Yogesh [2]
Prasanna, Viktor K. [1]
Affiliations
[1] Univ Southern Calif, Los Angeles, CA 90089 USA
[2] Indian Inst Sci, Bangalore 560012, Karnataka, India
Source
2015 IEEE 35TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS | 2015
Keywords
Distributed stream processing; Streaming MapReduce; Runtime elasticity; Fault-tolerance; Big data
DOI
10.1109/ICDCS.2015.41
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The MapReduce programming model, due to its simplicity and scalability, has become an essential tool for processing large data volumes in distributed environments. Recent Stream Processing Systems (SPS) extend this model to provide low-latency analysis of high-velocity continuous data streams. However, integrating MapReduce with streaming poses two challenges. First, runtime variations in data characteristics such as data rates and key distribution cause resource overload, which in turn leads to fluctuations in the Quality of Service (QoS). Second, stateful reducers, whose state depends on the complete tuple history, necessitate efficient fault-recovery mechanisms to maintain the desired QoS in the presence of resource failures. We propose an integrated streaming MapReduce architecture that leverages consistent hashing to support runtime elasticity, along with locality-aware data and state replication, to provide efficient load balancing with low-overhead fault tolerance and parallel fault recovery from multiple simultaneous failures. Our evaluation on a private cloud shows up to 2.8x improvement in peak throughput compared to the Apache Storm SPS, and a low recovery latency of 700 to 1500 ms from multiple failures.
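The abstract names consistent hashing as the basis for runtime elasticity. As a rough illustration only, the following is a generic textbook-style hash ring, not the authors' architecture: the class `HashRing`, the `vnodes` parameter, the use of MD5, and the reducer names are all assumptions made for this sketch. It shows the property that makes the technique attractive for elastic scaling: when a reducer is added, only the keys that move to the new reducer change owner.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a ring position (MD5 is used only to spread keys, not for security)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring mapping tuple keys to reducers."""

    def __init__(self, reducers=(), vnodes=64):
        self.vnodes = vnodes   # virtual nodes per reducer (assumed value)
        self._points = []      # sorted ring positions
        self._owner = {}       # ring position -> reducer name
        for reducer in reducers:
            self.add(reducer)

    def add(self, reducer):
        """Place `vnodes` points for the reducer on the ring."""
        for i in range(self.vnodes):
            point = _hash(f"{reducer}#{i}")
            self._owner[point] = reducer
            bisect.insort(self._points, point)

    def remove(self, reducer):
        """Drop every ring point owned by the reducer."""
        self._points = [p for p in self._points if self._owner[p] != reducer]
        self._owner = {p: r for p, r in self._owner.items() if r != reducer}

    def lookup(self, key):
        """Return the reducer owning the first ring point clockwise from the key."""
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


# Scale out from three reducers to four and see which keys are reassigned.
ring = HashRing(["reducer-0", "reducer-1", "reducer-2"])
keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}
ring.add("reducer-3")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved, all to reducer-3:",
      all(after[k] == "reducer-3" for k in moved))
```

In this sketch, every reassigned key lands on the newly added reducer, so existing reducers never exchange state with each other during a scale-out. How the paper combines this with locality-aware state replication is not described in the abstract and is not modeled here.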
Pages: 328-338
Page count: 11