MapReduce Programming and Cost-based Optimization? Crossing this Chasm with Starfish

Cited by: 0
|
Authors
Herodotou, Herodotos [1]
Dong, Fei [1]
Babu, Shivnath [1]
Affiliations
[1] Duke Univ, Durham, NC 27706 USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2011 / Vol. 4 / No. 12
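Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract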
MapReduce has emerged as a viable competitor to database systems in big data analytics. MapReduce programs are being written for a wide variety of application domains including business data processing, text analysis, natural language processing, Web graph and social network analysis, and computational science. However, MapReduce systems lack a feature that has been key to the historical success of database systems, namely, cost-based optimization. A major challenge here is that, to the MapReduce system, a program consists of black-box map and reduce functions written in some programming language like C++, Java, Python, or Ruby. Starfish is a self-tuning system for big data analytics that includes, to our knowledge, the first Cost-based Optimizer for simple to arbitrarily complex MapReduce programs. Starfish also includes a Profiler to collect detailed statistical information from unmodified MapReduce programs, and a What-if Engine for fine-grained cost estimation. This demonstration will present the profiling, what-if analysis, and cost-based optimization of MapReduce programs in Starfish. We will show how (nonexpert) users can employ the Starfish Visualizer to (a) get a deep understanding of a MapReduce program's behavior during execution, (b) ask hypothetical questions on how the program's behavior will change when parameter settings, cluster resources, or input data properties change, and (c) ultimately optimize the program.
Pages: 1446 - 1449
Page count: 4
Related papers
50 in total
  • [1] Profiling, What-if Analysis, and Cost-based Optimization of MapReduce Programs
    Herodotou, Herodotos
    Babu, Shivnath
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2011, 4 (11): 1111 - 1122
  • [2] Functional Programming - Crossing The Chasm?
    Thomas, Dave
    [J]. JOURNAL OF OBJECT TECHNOLOGY, 2009, 8 (05): 45 - 48
  • [3] Cost-based Query Optimization for XPath
    Li, Dong
    Chen, Wenhao
    Liang, Xiaochong
    Guan, Jida
    Xu, Yang
    Lu, Xiuyu
    [J]. APPLIED MATHEMATICS & INFORMATION SCIENCES, 2014, 8 (04): 1935 - 1948
  • [4] Cost-based, integrated design optimization
    Azhar Iqbal
    Jorn S. Hansen
    [J]. Structural and Multidisciplinary Optimization, 2006, 32 : 447 - 461
  • [5] Rapid optimization of cost-based tolerancing
    Yabe, Akira
    [J]. APPLIED OPTICS, 2012, 51 (07) : 855 - 860
  • [6] Visualizing Cost-Based XQuery Optimization
    Weiner, Andreas M.
    Haerder, Theo
    da Silva, Renato Oliveira
    [J]. 26TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING ICDE 2010, 2010: 1165 - 1168
  • [7] Cost-Based Optimization of Service Compositions
    Leitner, Philipp
    Hummer, Waldemar
    Dustdar, Schahram
    [J]. IEEE TRANSACTIONS ON SERVICES COMPUTING, 2013, 6 (02) : 239 - 251
  • [8] Cost-based, integrated design optimization
    Iqbal, Azhar
    Hansen, Jorn S.
    [J]. STRUCTURAL AND MULTIDISCIPLINARY OPTIMIZATION, 2006, 32 (06) : 447 - 461
  • [9] Cost-Based Domain Filtering for Stochastic Constraint Programming
    Rossi, Roberto
    Tarim, S. Armagan
    Hnich, Brahim
    Prestwich, Steven
    [J]. PRINCIPLES AND PRACTICE OF CONSTRAINT PROGRAMMING, 2008, 5202 : 235 - +
  • [10] A Cost-based Optimizer for Gradient Descent Optimization
    Kaoudi, Zoi
    Quiane-Ruiz, Jorge-Arnulfo
    Thirumuruganathan, Saravanan
    Chawla, Sanjay
    Agrawal, Divy
    [J]. SIGMOD'17: PROCEEDINGS OF THE 2017 ACM INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2017: 977 - 992