Evaluating End-to-End Optimization for Data Analytics Applications in Weld

Cited by: 36
Authors
Palkar, Shoumik [1]
Thomas, James [1]
Narayanan, Deepak [1]
Thaker, Pratiksha [1]
Palamuttam, Rahul [1]
Negi, Parimarjan [1]
Shanbhag, Anil [3]
Schwarzkopf, Malte [3]
Pirk, Holger [2]
Amarasinghe, Saman [3]
Madden, Samuel [3]
Zaharia, Matei [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] Imperial Coll London, London, England
[3] MIT CSAIL, Cambridge, MA USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2018 / Vol. 11 / No. 09
Keywords
QUERY; PERFORMANCE;
DOI
10.14778/3213880.3213890
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
Modern analytics applications use a diverse mix of libraries and functions. Unfortunately, there is no optimization across these libraries, resulting in performance penalties as high as an order of magnitude in many applications. To address this problem, we proposed Weld, a common runtime for existing data analytics libraries that performs key physical optimizations such as pipelining under existing, imperative library APIs. In this work, we further develop the Weld vision by designing an automatic adaptive optimizer for Weld applications, and evaluating its impact on realistic data science workloads. Our optimizer eliminates multiple forms of overhead that arise when composing imperative libraries like Pandas and NumPy, and uses lightweight measurements to make data-dependent decisions at runtime in ad-hoc workloads where no statistics are available, with sub-second overhead. We also evaluate which optimizations have the largest impact in practice and whether Weld can be integrated into libraries incrementally. Our results are promising: using our optimizer, Weld accelerates data science workloads by up to 23x on one thread and 80x on eight threads, and its adaptive optimizations provide up to a 3.75x speedup over rule-based optimization. Moreover, Weld provides benefits even if just 4-5 operators in a library are ported to use it. Our results show that common runtime designs like Weld may be a viable approach to accelerate analytics.
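The overhead the abstract attributes to composing imperative library calls can be illustrated with a small sketch. It is not Weld's API or implementation: the operators `scale`, `add_offset`, and `total` are hypothetical stand-ins for library functions, and `fused` is a hand-written version of what a pipelining optimization would aim to produce, assuming each library call otherwise materializes its full result.

```python
# Hypothetical "library" operators: each one makes a full pass over its
# input and materializes a new list, as independent library calls would.
def scale(xs, k):
    return [x * k for x in xs]


def add_offset(xs, b):
    return [x + b for x in xs]


def total(xs):
    return sum(xs)


def unfused(xs, k, b):
    # Three passes over the data and two intermediate lists.
    return total(add_offset(scale(xs, k), b))


def fused(xs, k, b):
    # Hand-pipelined equivalent: one pass, no intermediate lists.
    acc = 0
    for x in xs:
        acc += x * k + b
    return acc


if __name__ == "__main__":
    data = list(range(1_000))
    # Both versions compute the same result; only the data movement differs.
    assert unfused(data, 3, 1) == fused(data, 3, 1)
```

Both functions return the same value; the difference is only in how many times the data is scanned and how much temporary memory is allocated, which is the kind of cross-library cost the abstract describes.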
Pages: 1002 - 1015
Page count: 14
Related papers
50 in total
  • [1] End-to-End Optimization of Deep Learning Applications
    Sohrabizadeh, Atefeh
    Wang, Jie
    Cong, Jason
    [J]. 2020 ACM/SIGDA INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE GATE ARRAYS (FPGA '20), 2020, : 133 - 139
  • [2] Multi-layer Optimizations for End-to-End Data Analytics
    Shaikhha, Amir
    Schleich, Maximilian
    Ghita, Alexandru
    Olteanu, Dan
    [J]. CGO'20: PROCEEDINGS OF THE 18TH ACM/IEEE INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION, 2020, : 145 - 157
  • [3] Building End-to-End Management Analytics for Enterprise Data Centers
    Huang, Hai
    Ruan, Yaoping
    Shaikh, Anees
    Routray, Ramani
    Tan, Chung-hao
    Gopisetty, Sandeep
    [J]. 2009 IFIP/IEEE INTERNATIONAL SYMPOSIUM ON INTEGRATED NETWORK MANAGEMENT (IM 2009) VOLS 1 AND 2, 2009, : 661 - 675
  • [4] End-to-End Data Analytics Framework for 5G Architecture
    Pateromichelakis, Emmanouil
    Moggio, Fabrizio
    Mannweiler, Christian
    Arnold, Paul
    Shariat, Mehrdad
    Einhaus, Michael
    Wei, Qing
    Bulakci, Omer
    De Domenico, Antonio
    [J]. IEEE ACCESS, 2019, 7 : 40295 - 40312
  • [5] Optimization of end-to-end service
    Shao, Bi-Lin
    Zhang, Zhi-Xia
    [J]. Xi'an Jianzhu Keji Daxue Xuebao/Journal of Xi'an University of Architecture and Technology, 2004, 36 (04):
  • [6] A hybrid framework for sequential data prediction with end-to-end optimization
    Aydin, Mustafa E.
    Kozat, Suleyman S.
    [J]. DIGITAL SIGNAL PROCESSING, 2022, 129
  • [8] A Framework for Evaluating the End-to-End Trustworthiness
    Mohammadi, Nazila Gol
    Bandyszak, Torsten
    Weyer, Thorsten
    Kalogiros, Costas
    Kanakakis, Michalis
    [J]. 2015 IEEE TRUSTCOM/BIGDATASE/ISPA, VOL 1, 2015, : 638 - 645
  • [9] An end-to-end model-based approach to support big data analytics development
    Khalajzadeh, Hourieh
    Simmons, Andrew J.
    Abdelrazek, Mohamed
    Grundy, John
    Hosking, John
    He, Qiang
    [J]. JOURNAL OF COMPUTER LANGUAGES, 2020, 58
  • [10] EVA: An End-to-End Exploratory Video Analytics System
    Kakkar, Gaurav Tarlok
    Cao, Jiashen
    Chunduri, Pramod
    Xu, Zhuangdi
    Vyalla, Suryatej Reddy
    Dintyala, Prashanth
    Prabakaran, Anirudh
    Bang, Jaeho
    Sengupta, Aubhro
    Ravichandran, Kaushik
    Sivakumar, Ishwarya
    Rajoria, Aryan
    Raju, Ashmita
    Aggarwal, Tushar
    Shah, Abdullah
    Garg, Sanjana
    Suman, Shashank
    Kalluraya, Myna Prasanna
    Mitra, Subrata
    Payani, Ali
    Lu, Yao
    Ramachandran, Umakishore
    Arulraj, Joy
    [J]. PROCEEDINGS OF THE SEVENTH WORKSHOP ON DATA MANAGEMENT FOR END-TO-END MACHINE LEARNING, DEEM, 2023,