A Case for Abstract Cost Models for Distributed Execution of Analytics Operators

Authors
Li, Rundong [1]
Mi, Ningfang [2]
Riedewald, Mirek [1]
Sun, Yizhou [3]
Yao, Yi [2]
Affiliations
[1] Northeastern Univ, CCIS, Boston, MA 02115 USA
[2] Northeastern Univ, ECE, Boston, MA 02115 USA
[3] UCLA, Dept Comp Sci, Los Angeles, CA 90024 USA
Funding
US National Science Foundation; US National Institutes of Health
Keywords
MATRIX MULTIPLICATION; ALGORITHMS; MAPREDUCE;
DOI
10.1007/978-3-319-64283-3_11
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We consider data analytics workloads on distributed architectures, in particular clusters of commodity machines. Finding a job partitioning that minimizes running time requires a cost model, which we more accurately call a makespan model. Seeking the simplest model that is still sufficiently accurate, we explore piecewise linear functions of input, output, and computational complexity. These models are abstract in the sense that they capture fundamental algorithm properties but do not require explicit modeling of system and implementation details such as the number of disk accesses. We show how the simplified functional structure can be exploited by integrating the model directly into the makespan optimization process, reducing complexity by orders of magnitude. Experimental results provide evidence of good prediction quality and successful makespan optimization across a variety of cluster architectures.
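To make the idea in the abstract concrete, here is a minimal sketch of what a piecewise linear makespan model over input, output, and computation could look like. It is not the authors' model: the `Segment` class, the single breakpoint on input size, the coefficients, the even-split assumption, and the wave-based scheduling are all hypothetical choices made for illustration. The sketch also picks a partition count by plain enumeration, whereas the abstract describes integrating the model directly into the optimization.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One linear piece, used while the task input is <= upper_input (hypothetical)."""
    upper_input: float   # breakpoint on per-task input size
    base: float          # fixed per-task overhead
    per_input: float     # cost per unit of input
    per_output: float    # cost per unit of output
    per_compute: float   # cost per unit of computation


def task_time(segments, inp, out, comp):
    """Evaluate the piecewise linear model for one task.

    `segments` must be sorted by ascending upper_input.
    """
    for seg in segments:
        if inp <= seg.upper_input:
            return (seg.base + seg.per_input * inp
                    + seg.per_output * out + seg.per_compute * comp)
    raise ValueError("input size exceeds the modeled range")


def predicted_makespan(segments, total, k, workers):
    """Predicted makespan for k equal tasks on `workers` machines.

    Assumption: work splits evenly and tasks run in ceil(k / workers) waves.
    """
    inp, out, comp = (x / k for x in total)
    waves = math.ceil(k / workers)
    return waves * task_time(segments, inp, out, comp)


def best_partition_count(segments, total, workers, candidates):
    """Pick the candidate partition count with the smallest predicted makespan."""
    return min(candidates,
               key=lambda k: predicted_makespan(segments, total, k, workers))


# Illustrative coefficients only; they are not fitted to any real cluster.
SEGMENTS = [
    Segment(upper_input=100.0, base=1.0, per_input=0.1,
            per_output=0.1, per_compute=0.01),
    Segment(upper_input=math.inf, base=5.0, per_input=0.3,
            per_output=0.1, per_compute=0.01),
]
TOTAL = (400.0, 40.0, 4000.0)  # total input, output, computation
BEST_K = best_partition_count(SEGMENTS, TOTAL, workers=4, candidates=[1, 2, 4, 8])
```

In this toy setting the steeper second segment penalizes large tasks, while the fixed per-task overhead and the extra wave penalize too many small tasks, so the preferred partition count sits between the extremes.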
Pages: 149-163
Number of pages: 15