Improving MapReduce Performance with Partial Speculative Execution

Cited by: 21
Authors
Wang, Yaoguang [1]
Lu, Weiming [1]
Lou, Renjie [1]
Wei, Baogang [1]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci, Hangzhou 310003, Zhejiang, Peoples R China
Keywords
Speculative execution; MapReduce performance; Straggler mitigation; Scheduling algorithm
DOI
10.1007/s10723-015-9350-y
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The MapReduce framework has become the de facto standard for big data processing due to its attractive features and abilities. One such feature is that it automatically parallelizes a job into multiple tasks and transparently handles task execution on a large cluster of commodity machines. The increasing heterogeneity of distributed environments may result in a few straggling tasks, which prolong job completion. Speculative execution was proposed to mitigate stragglers. However, the existing speculative execution mechanism does not work efficiently, as many speculative tasks are still slower than their original tasks. In this paper, we explore an approach to increase the efficiency of speculative execution and further improve MapReduce performance. We propose the Partial Speculative Execution (PSE) strategy, which makes speculative tasks start from a checkpoint. By leveraging the checkpoint of the original task, PSE can eliminate the costs of re-reading, re-copying, and re-computing the already processed data. We implement PSE in Hadoop and evaluate its performance in terms of job completion time and the efficiency of speculative execution under several kinds of classical workloads. Experimental results show that, in heterogeneous environments with stragglers, PSE completes jobs 56% faster than with no speculation and 12% faster than with LATE, an improved speculative execution algorithm. In addition, on average PSE improves the efficiency of speculative execution by 24% compared to LATE.
Pages: 587-604
Page count: 18
Related papers
50 in total
  • [31] Improving MapReduce Performance by Balancing Skewed Loads
    Fan Yuanquan
    Wu Weiguo
    Xu Yunlong
    Chen Heng
    CHINA COMMUNICATIONS, 2014, 11 (08) : 85 - 108
  • [32] NPIY : A novel partitioner for improving mapreduce performance
    Lu, Wei
    Chen, Lei
    Wang, Liqiang
    Yuan, Haitao
    Xing, Weiwei
    Yang, Yong
    JOURNAL OF VISUAL LANGUAGES AND COMPUTING, 2018, 46 : 1 - 11
  • [33] A theory of nested speculative execution
    Tapus, Cristian
    Hickey, Jason
    COORDINATION MODELS AND LANGUAGES, PROCEEDINGS, 2007, 4467 : 151 - +
  • [34] Speculative Data-Oblivious Execution: Mobilizing Safe Prediction For Safe and Efficient Speculative Execution
    Yu, Jiyong
    Mantri, Namrata
    Torrellas, Josep
    Morrison, Adam
    Fletcher, Christopher W.
    2020 ACM/IEEE 47TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA 2020), 2020, : 707 - 720
  • [35] Dynamically Spawning Speculative Threads to Improve Speculative Path Execution
    Li, Meirong
    Zhao, Yinliang
    Tao, You
    ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2014, PT II, 2014, 8631 : 192 - 206
  • [36] Data Preloading and Data Placement for MapReduce Performance Improving
    Spivak, Anton
    Nasonov, Denis
    5TH INTERNATIONAL YOUNG SCIENTIST CONFERENCE ON COMPUTATIONAL SCIENCE, YSC 2016, 2016, 101 : 379 - 387
  • [37] Improving the performance of aggregate queries with cached tuples in mapReduce
    Peng, Dunlu
    Duan, Kai
    Xie, Lei
    International Journal of Database Theory and Application, 2013, 6 (01): : 13 - 24
  • [38] Improving MapReduce Performance in a Heterogeneous Cloud: A Measurement Study
    Zhao, Xu
    Liu, Ling
    Zhang, Qi
    Dong, Xiaoshe
    2014 IEEE 7TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (CLOUD), 2014, : 401 - 408
  • [39] The impact of speculative execution on SMT processors
    Kang, Dongsoo
    Liu, Chen
    Gaudiot, Jean-Luc
    INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2008, 36 (04) : 361 - 385
  • [40] Value Prediction and Speculative Execution on GPU
    Liu, Shaoshan
    Eisenbeis, Christine
    Gaudiot, Jean-Luc
    INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2011, 39 (05) : 533 - 552