Parallel Scheduling of Data-Intensive Tasks

Authors
Meng, Xiao [1]
Golab, Lukasz [1]
Affiliations
[1] Univ Waterloo, Waterloo, ON, Canada
Keywords
Parallel scheduling; Data-intensive tasks; Caching
DOI
10.1007/978-3-030-57675-2_8
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Workloads with precedence constraints due to data dependencies are common in various applications. These workloads can be represented as directed acyclic graphs (DAGs) and are often data-intensive, meaning that data loading cost is the dominant factor and thus cache misses should be minimized. We address the problem of parallel scheduling of a DAG of data-intensive tasks to minimize makespan. To do so, we propose greedy online scheduling algorithms that take load balancing, data dependencies, and data locality into account. Simulations and an experimental evaluation using an Apache Spark cluster demonstrate the advantages of our solutions.
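The abstract does not give the algorithms themselves, so the following is only a minimal illustrative sketch of what a greedy, locality-aware scheduler for a task DAG might look like. It is not the authors' method. The function name `greedy_schedule`, the cost parameters `load_cost` and `compute_cost`, the unbounded per-worker cache, and the rule "pick the ready task and worker pair with the earliest finish time" are all assumptions made for illustration.

```python
def greedy_schedule(tasks, deps, inputs, num_workers, load_cost=5, compute_cost=1):
    """Hypothetical greedy scheduler (illustrative only, not the paper's algorithm).

    tasks:  list of task names
    deps:   dict mapping a task to the set of tasks it must wait for
    inputs: dict mapping a task to the set of data items it reads
    Returns (makespan, assignment), where assignment[task] = (worker, start, end).
    """
    free_at = [0] * num_workers                    # time at which each worker is idle
    cache = [set() for _ in range(num_workers)]    # assumed unbounded per-worker cache
    finish = {}                                    # task -> finish time
    assignment = {}
    remaining = list(tasks)

    while remaining:
        # Data dependencies: a task is ready once all its prerequisites finished.
        ready = [t for t in remaining if all(d in finish for d in deps.get(t, ()))]
        if not ready:
            raise ValueError("dependency cycle or unknown prerequisite")

        candidates = []
        for t in ready:
            dep_done = max((finish[d] for d in deps.get(t, ())), default=0)
            for w in range(num_workers):
                # Data locality: every input not cached on worker w is a miss.
                misses = len(set(inputs.get(t, ())) - cache[w])
                # Load balancing: a busy worker delays the start time.
                start = max(free_at[w], dep_done)
                end = start + compute_cost + load_cost * misses
                candidates.append((end, misses, t, w, start))

        # Greedy choice: earliest finish time, ties broken by fewer cache misses.
        end, _, t, w, start = min(candidates)
        cache[w] |= set(inputs.get(t, ()))
        free_at[w] = end
        finish[t] = end
        assignment[t] = (w, start, end)
        remaining.remove(t)

    return max(finish.values(), default=0), assignment


# Two independent chains, A -> C reading item "x" and B -> D reading item "y".
tasks = ["A", "B", "C", "D"]
deps = {"C": {"A"}, "D": {"B"}}
inputs = {"A": {"x"}, "B": {"y"}, "C": {"x"}, "D": {"y"}}
makespan, assignment = greedy_schedule(tasks, deps, inputs, num_workers=2)
print(makespan)  # 7
```

In this toy instance each chain stays on the worker that already cached its data item, so the second task of each chain pays only the compute cost. A real system would also need bounded caches with an eviction policy and measured rather than fixed costs, neither of which this sketch models.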
Pages: 117-133
Number of pages: 17