MARS: Exploiting Multi-Level Parallelism for DNN Workloads on Adaptive Multi-Accelerator Systems

Cited by: 0
Authors
Shen, Guan [1]
Zhao, Jieru [1]
Wang, Zeke [2]
Lin, Zhe [3]
Ding, Wenchao [4]
Wu, Chentao [1]
Chen, Quan [1]
Guo, Minyi [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[2] Zhejiang Univ, Hangzhou, Peoples R China
[3] Sun Yat Sen Univ, Guangzhou, Peoples R China
[4] Fudan Univ, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1109/DAC56929.2023.10247992
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Along with the fast evolution of deep neural networks, the hardware system is also developing rapidly. As a promising solution achieving high scalability and low manufacturing cost, multi-accelerator systems widely exist in data centers, cloud platforms, and SoCs. Thus, a challenging problem arises in multi-accelerator systems: selecting a proper combination of accelerators from available designs and searching for efficient DNN mapping strategies. To this end, we propose MARS, a novel mapping framework that can perform computation-aware accelerator selection, and apply communication-aware sharding strategies to maximize parallelism. Experimental results show that MARS can achieve 32.2% latency reduction on average for typical DNN workloads compared to the baseline, and 59.4% latency reduction on heterogeneous models compared to the corresponding state-of-the-art method.
Pages: 6