Inferencing on Edge Devices: A Time- and Space-aware Co-scheduling Approach

Cited: 0
Authors
Pereira, Danny [1 ]
Ghose, Anirban [1 ]
Ghosh, Sumana [2 ]
Dey, Soumyajit [1 ]
Affiliations
[1] Indian Inst Technol Kharagpur, Dept Comp Sci & Engn, Kharagpur 721302, W Bengal, India
[2] Indian Stat Inst, Elect & Commun Sci Unit, Kolkata 700108, W Bengal, India
Keywords
Convolutional neural network; edge device; GPU; Satisfiability Modulo Theories
DOI
10.1145/3576197
CLC Number
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
Neural Network (NN)-based real-time inferencing tasks are often co-scheduled on GPGPU-style edge platforms. Existing works advocate using different NN parameters for the same detection task in different environments. However, realizing such approaches remains challenging, given accelerator devices' limited on-chip memory capacity. As a solution, we propose a multi-pass, time- and space-aware scheduling infrastructure for embedded platforms with GPU accelerators. The framework manages the residency of NN parameters in the limited on-chip memory while simultaneously dispatching the relevant compute operations. The mapping decisions for memory operations and compute operations onto the underlying resources of the platform are first determined offline. For this, we propose a constraint solver-assisted scheduler that optimizes for schedule makespan. This is followed by memory optimization passes, which take the memory budget into account and adjust the start times of memory and compute operations accordingly. Our approach achieves 74%-90% savings in peak memory utilization with 0%-33% deadline misses on schedules that suffer miss percentages of 25%-100% when run using existing methods.
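The memory optimization pass described above can be illustrated with a minimal sketch. This is not the paper's algorithm; it is a simplified, illustrative model in which operations already ordered by an offline scheduler are delayed just enough that the total on-chip memory held by concurrently running operations never exceeds a budget. All names (`Op`, `adjust_for_budget`) and the greedy delay policy are assumptions for illustration.

```python
# Sketch of a memory-budget-aware start-time adjustment pass.
# Assumption: each op holds a fixed amount of on-chip memory for its
# whole duration; ops are admitted in offline-schedule order and pushed
# later in time until the concurrent memory footprint fits the budget.

from dataclasses import dataclass


@dataclass
class Op:
    name: str
    start: int      # start time proposed by the offline scheduler
    duration: int
    mem: int        # on-chip memory held while the op runs


def adjust_for_budget(ops, budget):
    """Delay ops (in offline-schedule order) so that the memory held by
    overlapping ops never exceeds `budget`. Assumes each op.mem <= budget."""
    placed = []   # (start, end, mem) of already-admitted ops
    result = []
    for op in sorted(ops, key=lambda o: o.start):
        t = op.start
        while True:
            # memory held by placed ops that overlap the window [t, t+duration)
            used = sum(m for s, e, m in placed
                       if s < t + op.duration and e > t)
            if used + op.mem <= budget:
                break
            # push the start to the earliest time an overlapping op finishes
            t = min(e for s, e, m in placed
                    if s < t + op.duration and e > t)
        placed.append((t, t + op.duration, op.mem))
        result.append((op.name, t))
    return result


# Toy schedule: two weight-load (memory) ops and one compute op.
ops = [Op("load_w0", 0, 2, 6), Op("conv0", 0, 3, 3), Op("load_w1", 1, 2, 4)]
print(adjust_for_budget(ops, budget=10))
```

In this toy run, `load_w1` is delayed from time 1 to time 2, when `load_w0` releases its memory, so the footprint stays within the budget of 10 units. The real framework additionally maps operations to platform resources with an SMT solver and checks deadlines, which this sketch omits.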
Pages: 33