MARS: Exploiting Multi-Level Parallelism for DNN Workloads on Adaptive Multi-Accelerator Systems

Citations: 0
Authors
Shen, Guan [1 ]
Zhao, Jieru [1 ]
Wang, Zeke [2 ]
Lin, Zhe [3 ]
Ding, Wenchao [4 ]
Wu, Chentao [1 ]
Chen, Quan [1 ]
Guo, Minyi [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[2] Zhejiang Univ, Hangzhou, Peoples R China
[3] Sun Yat Sen Univ, Guangzhou, Peoples R China
[4] Fudan Univ, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1109/DAC56929.2023.10247992
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Along with the fast evolution of deep neural networks, hardware systems are developing rapidly as well. As a promising solution offering high scalability and low manufacturing cost, multi-accelerator systems are widely deployed in data centers, cloud platforms, and SoCs. This raises a challenging problem for multi-accelerator systems: selecting a proper combination of accelerators from the available designs and searching for efficient DNN mapping strategies. To this end, we propose MARS, a novel mapping framework that performs computation-aware accelerator selection and applies communication-aware sharding strategies to maximize parallelism. Experimental results show that MARS achieves an average latency reduction of 32.2% on typical DNN workloads compared to the baseline, and a 59.4% latency reduction on heterogeneous models compared to the corresponding state-of-the-art method.
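The abstract couples two decisions: choosing a subset of accelerators based on their compute capability, and sharding the DNN across that subset with communication cost in mind. The sketch below is a minimal illustration of such a joint search under toy cost models; the class names, cost formulas, and numbers are illustrative assumptions, not the actual MARS algorithm.

```python
# Hypothetical sketch: jointly pick an accelerator combination
# (computation-aware) and account for sharding traffic
# (communication-aware). All cost models here are assumed.
from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class Accel:
    name: str
    tflops: float     # assumed peak compute (TFLOP/s)
    link_gbps: float  # assumed interconnect bandwidth (Gbit/s)

def compute_latency(flops: float, accels) -> float:
    # Computation-aware term: work split proportionally to peak compute.
    return flops / (sum(a.tflops for a in accels) * 1e12)

def comm_latency(act_bytes: float, accels) -> float:
    # Communication-aware term: sharding traffic bounded by the slowest
    # link; a single accelerator pays no synchronization cost.
    if len(accels) == 1:
        return 0.0
    return act_bytes / (min(a.link_gbps for a in accels) * 1.25e8)

def select_and_shard(flops: float, act_bytes: float, pool, max_accels: int = 2):
    # Enumerate accelerator combinations and keep the one minimizing
    # modeled compute latency plus communication latency.
    best = None
    for k in range(1, max_accels + 1):
        for combo in combinations(pool, k):
            t = compute_latency(flops, combo) + comm_latency(act_bytes, combo)
            if best is None or t < best[0]:
                best = (t, combo)
    return best

pool = [Accel("big", tflops=100.0, link_gbps=200.0),
        Accel("small", tflops=25.0, link_gbps=50.0)]
t, combo = select_and_shard(flops=2e12, act_bytes=4e8, pool=pool)
print(f"pick {[a.name for a in combo]} -> {t * 1e3:.2f} ms (modeled)")
```

With these toy numbers the search prefers the single "big" accelerator, because the modeled sharding traffic outweighs the extra compute capacity; that compute-versus-communication trade-off is the tension the framework navigates.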
Pages: 6
Related Papers (50 in total)
  • [31] Trireme: Exploration of Hierarchical Multi-level Parallelism for Hardware Acceleration
    Zacharopoulos, Georgios
    Ejjeh, Adel
    Jing, Ying
    Yang, En-Yu
    Jia, Tianyu
    Brumar, Iulian
    Intan, Jeremy
    Huzaifa, Muhammad
    Adve, Sarita
    Adve, Vikram
    Wei, Gu-Yeon
    Brooks, David
    ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 2023, 22 (03)
  • [32] Multi-level parallelism for incompressible flow computations on GPU clusters
    Jacobsen, Dana A.
    Senocak, Inanc
    PARALLEL COMPUTING, 2013, 39 (01) : 1 - 20
  • [33] Scalable State Space Search on the GPU with Multi-Level Parallelism
    Shipovalov, Egor
    Pryanichnikov, Valentin
    2020 19TH INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED COMPUTING (ISPDC 2020), 2020, : 84 - 92
  • [34] MARS: A multi-level array representation for simulation data
    Kim, Minsoo
    Suh, Ilhyun
    Chung, Yon Dohn
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2020, 111 : 419 - 434
  • [35] Multi-level parallelism in the block-Jacobi SVD algorithm
    Oksa, G
    Vajtersic, M
    NINTH EUROMICRO WORKSHOP ON PARALLEL AND DISTRIBUTED PROCESSING, PROCEEDINGS, 2001, : 306 - 313
  • [36] Load balancing multi-zone applications on a heterogeneous cluster with multi-level parallelism
    Wong, P
    Jin, HQ
    Becker, J
    ISPDC 2004: THIRD INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED COMPUTING/HETEROPAR '04: THIRD INTERNATIONAL WORKSHOP ON ALGORITHMS, MODELS AND TOOLS FOR PARALLEL COMPUTING ON HETEROGENEOUS NETWORKS, PROCEEDINGS, 2004, : 388 - 393
  • [37] Adaptive multi-teacher multi-level knowledge distillation
    Liu, Yuang
    Zhang, Wei
    Wang, Jun
    NEUROCOMPUTING, 2020, 415 : 106 - 113
  • [39] Exploiting the thread-level parallelism for BGP on Multi-core
    Gao Lei
    Lai Mingche
    Gong Zhenghu
    CNSR 2008: PROCEEDINGS OF THE 6TH ANNUAL COMMUNICATION NETWORKS AND SERVICES RESEARCH CONFERENCE, 2008, : 510 - 516
  • [40] A Multi-Factor Adaptive Multi-Level Cooperative Replacement Policy in Block Storage Systems
    Zhou, Yang
    Wang, Fang
    Shi, Zhan
    Feng, Dan
    2022 IEEE 40TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD 2022), 2022, : 67 - 75