Locality-Aware Scheduling of Independent Tasks for Runtime Systems

Cited by: 2
Authors
Gonthier, Maxime [1, 2]
Marchal, Loris [1, 2]
Thibault, Samuel [3]
Affiliations
[1] ENS Lyon, LIP, CNRS, INRIA, Lyon, France
[2] Univ Claude Bernard Lyon 1, Lyon, France
[3] Univ Bordeaux, CNRS, LaBRI, Inria Bordeaux Sud Ouest, Talence, France
Keywords
Memory-aware scheduling; Eviction policy; Tasks sharing data; Runtime systems
DOI
10.1007/978-3-031-06156-1_1
Chinese Library Classification (CLC) number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
A now-classical way of meeting the increasing demand for computing speed by HPC applications is the use of GPUs and/or other accelerators. Such accelerators have their own memory, which is usually quite limited, and are connected to the main memory through a bus with bounded bandwidth. Thus, particular care should be devoted to data locality in order to avoid unnecessary data movements. Task-based runtime schedulers have emerged as a convenient and efficient way to use such heterogeneous platforms. When processing an application, the scheduler knows all tasks available for processing on a GPU, as well as their input data dependencies. Hence, it is able to order tasks and prefetch their input data into the GPU memory (after possibly evicting some previously loaded data), while aiming at minimizing data movements, so as to reduce the total processing time. In this paper, we focus on how to schedule tasks that share some of their input data (but are otherwise independent) on a GPU. We provide a formal model of the problem, exhibit an optimal eviction strategy, and show that ordering tasks to minimize data movement is NP-complete. We review and adapt existing ordering strategies to this problem, and propose a new one based on task aggregation. These strategies have been implemented in the StarPU runtime system. We present their performance on tasks from tiled 2D and 3D matrix products. Our experiments demonstrate that using our new strategy together with the optimal eviction policy reduces the amount of data movement as well as the total processing time.
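To illustrate the problem the abstract describes, here is a minimal sketch that counts data loads for a given task order on a memory of bounded capacity. It is not the paper's implementation and does not use StarPU. The function name `count_loads`, the unit-size data items, and the eviction rule (evict the item whose next use is furthest in the future, in the style of Belady's rule) are all assumptions made for illustration; the abstract does not name the optimal eviction strategy it exhibits.

```python
def count_loads(order, inputs, capacity):
    """Count data loads when tasks run in `order` on a memory of `capacity` items.

    `inputs` maps each task to a tuple of the data items it reads.
    All data items are assumed to have unit size (an illustrative assumption).
    Eviction rule (assumed, Belady-style): evict the item not needed by the
    current task whose next use lies furthest in the future.
    """
    seq = [inputs[t] for t in order]
    memory = set()
    loads = 0
    for i, needed in enumerate(seq):
        needed_set = set(needed)
        if len(needed_set) > capacity:
            raise ValueError("a task's inputs do not fit in memory")

        def next_use(item):
            # Index of the next task (after the current one) that reads `item`.
            for j in range(i + 1, len(seq)):
                if item in seq[j]:
                    return j
            return float("inf")  # never used again

        for item in needed:
            if item in memory:
                continue  # already resident: no data movement
            if len(memory) >= capacity:
                # Only items the current task does not need may be evicted.
                candidates = sorted(memory - needed_set)
                memory.remove(max(candidates, key=next_use))
            memory.add(item)
            loads += 1
    return loads


# Toy 2x2 tiled product: task (i, j) reads row block A_i and column block B_j.
inputs = {(i, j): (f"A{i}", f"B{j}") for i in range(2) for j in range(2)}
row_major = [(0, 0), (0, 1), (1, 0), (1, 1)]
snake = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(count_loads(row_major, inputs, capacity=2))
print(count_loads(snake, inputs, capacity=2))
```

With room for only two data items, the row-major order costs 6 loads while the snake order costs 5, which shows in miniature why the task order matters even when the eviction rule is fixed.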
Pages: 5-16
Page count: 12