Inferencing on Edge Devices: A Time- and Space-aware Co-scheduling Approach

Cited: 0
Authors
Pereira, Danny [1 ]
Ghose, Anirban [1 ]
Ghosh, Sumana [2 ]
Dey, Soumyajit [1 ]
Affiliations
[1] Indian Inst Technol Kharagpur, Dept Comp Sci & Engn, Kharagpur 721302, W Bengal, India
[2] Indian Stat Inst, Elect & Commun Sci Unit, Kolkata 700108, W Bengal, India
Keywords
Convolutional neural network; edge device; GPU; Satisfiability Modulo Theories
DOI
10.1145/3576197
CLC Number
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
Neural Network (NN)-based real-time inferencing tasks are often co-scheduled on GPGPU-style edge platforms. Existing works advocate using different NN parameters for the same detection task in different environments. However, realizing such approaches remains challenging, given accelerator devices' limited on-chip memory capacity. As a solution, we propose a multi-pass, time- and space-aware scheduling infrastructure for embedded platforms with GPU accelerators. The framework manages the residency of NN parameters in the limited on-chip memory while simultaneously dispatching the relevant compute operations. The mapping decisions for memory operations and compute operations onto the underlying resources of the platform are first determined offline. For this, we propose a constraint solver-assisted scheduler that optimizes for schedule makespan. This is followed by memory optimization passes, which take the memory budget into account and adjust the start times of memory and compute operations accordingly. Our approach achieves 74%-90% savings in peak memory utilization with 0%-33% deadline misses on schedules that suffer miss percentages of 25%-100% when run using existing methods.
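The memory optimization pass described above can be illustrated with a minimal sketch. This is not the paper's algorithm; it is a simplified, illustrative model in which operations already ordered by an offline scheduler are delayed just enough that the total on-chip memory held by concurrently running operations never exceeds a budget. All names (`Op`, `adjust_for_budget`) and the greedy delay policy are assumptions for illustration.

```python
# Sketch of a memory-budget-aware start-time adjustment pass.
# Assumption: each op holds a fixed amount of on-chip memory for its
# whole duration; ops are admitted in offline-schedule order and pushed
# later in time until the concurrent memory footprint fits the budget.

from dataclasses import dataclass


@dataclass
class Op:
    name: str
    start: int      # start time proposed by the offline scheduler
    duration: int
    mem: int        # on-chip memory held while the op runs


def adjust_for_budget(ops, budget):
    """Delay ops (in offline-schedule order) so that the memory held by
    overlapping ops never exceeds `budget`. Assumes each op.mem <= budget."""
    placed = []   # (start, end, mem) of already-admitted ops
    result = []
    for op in sorted(ops, key=lambda o: o.start):
        t = op.start
        while True:
            # memory held by placed ops that overlap the window [t, t+duration)
            used = sum(m for s, e, m in placed
                       if s < t + op.duration and e > t)
            if used + op.mem <= budget:
                break
            # push the start to the earliest time an overlapping op finishes
            t = min(e for s, e, m in placed
                    if s < t + op.duration and e > t)
        placed.append((t, t + op.duration, op.mem))
        result.append((op.name, t))
    return result


# Toy schedule: two weight-load (memory) ops and one compute op.
ops = [Op("load_w0", 0, 2, 6), Op("conv0", 0, 3, 3), Op("load_w1", 1, 2, 4)]
print(adjust_for_budget(ops, budget=10))
```

In this toy run, `load_w1` is delayed from time 1 to time 2, when `load_w0` releases its memory, so the footprint stays within the budget of 10 units. The real framework additionally maps operations to platform resources with an SMT solver and checks deadlines, which this sketch omits.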
Pages: 33