Profiling, What-if Analysis, and Cost-based Optimization of MapReduce Programs

Citations: 0
Authors
Herodotou, Herodotos [1]
Babu, Shivnath [1]
Affiliation
[1] Duke Univ, Durham, NC 27706 USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2011, Vol. 4, No. 11
Keywords
Compendex;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
MapReduce has emerged as a viable competitor to database systems in big data analytics. MapReduce programs are being written for a wide variety of application domains including business data processing, text analysis, natural language processing, Web graph and social network analysis, and computational science. However, MapReduce systems lack a feature that has been key to the historical success of database systems, namely, cost-based optimization. A major challenge here is that, to the MapReduce system, a program consists of black-box map and reduce functions written in some programming language like C++, Java, Python, or Ruby. We introduce, to our knowledge, the first Cost-based Optimizer for simple to arbitrarily complex MapReduce programs. We focus on the optimization opportunities presented by the large space of configuration parameters for these programs. We also introduce a Profiler to collect detailed statistical information from unmodified MapReduce programs, and a What-if Engine for fine-grained cost estimation. All components have been prototyped for the popular Hadoop MapReduce system. The effectiveness of each component is demonstrated through a comprehensive evaluation using representative MapReduce programs from various application domains.
Pages: 1111-1122
Number of pages: 12