MARS: Exploiting Multi-Level Parallelism for DNN Workloads on Adaptive Multi-Accelerator Systems

Citations: 0
Authors
Shen, Guan [1 ]
Zhao, Jieru [1 ]
Wang, Zeke [2 ]
Lin, Zhe [3 ]
Ding, Wenchao [4 ]
Wu, Chentao [1 ]
Chen, Quan [1 ]
Guo, Minyi [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[2] Zhejiang Univ, Hangzhou, Peoples R China
[3] Sun Yat Sen Univ, Guangzhou, Peoples R China
[4] Fudan Univ, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1109/DAC56929.2023.10247992
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Along with the fast evolution of deep neural networks, hardware systems are developing rapidly as well. As a promising solution offering high scalability and low manufacturing cost, multi-accelerator systems are widely deployed in data centers, cloud platforms, and SoCs. This raises a challenging problem for multi-accelerator systems: selecting a proper combination of accelerators from the available designs and searching for efficient DNN mapping strategies. To this end, we propose MARS, a novel mapping framework that performs computation-aware accelerator selection and applies communication-aware sharding strategies to maximize parallelism. Experimental results show that MARS achieves an average latency reduction of 32.2% on typical DNN workloads compared to the baseline, and a 59.4% latency reduction on heterogeneous models compared to the corresponding state-of-the-art method.
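The abstract couples two decisions: choosing a subset of accelerators based on their compute capability, and sharding the DNN across that subset with communication cost in mind. The sketch below is a minimal illustration of such a joint search under toy cost models; the class names, cost formulas, and numbers are illustrative assumptions, not the actual MARS algorithm.

```python
# Hypothetical sketch: jointly pick an accelerator combination
# (computation-aware) and account for sharding traffic
# (communication-aware). All cost models here are assumed.
from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class Accel:
    name: str
    tflops: float     # assumed peak compute (TFLOP/s)
    link_gbps: float  # assumed interconnect bandwidth (Gbit/s)

def compute_latency(flops: float, accels) -> float:
    # Computation-aware term: work split proportionally to peak compute.
    return flops / (sum(a.tflops for a in accels) * 1e12)

def comm_latency(act_bytes: float, accels) -> float:
    # Communication-aware term: sharding traffic bounded by the slowest
    # link; a single accelerator pays no synchronization cost.
    if len(accels) == 1:
        return 0.0
    return act_bytes / (min(a.link_gbps for a in accels) * 1.25e8)

def select_and_shard(flops: float, act_bytes: float, pool, max_accels: int = 2):
    # Enumerate accelerator combinations and keep the one minimizing
    # modeled compute latency plus communication latency.
    best = None
    for k in range(1, max_accels + 1):
        for combo in combinations(pool, k):
            t = compute_latency(flops, combo) + comm_latency(act_bytes, combo)
            if best is None or t < best[0]:
                best = (t, combo)
    return best

pool = [Accel("big", tflops=100.0, link_gbps=200.0),
        Accel("small", tflops=25.0, link_gbps=50.0)]
t, combo = select_and_shard(flops=2e12, act_bytes=4e8, pool=pool)
print(f"pick {[a.name for a in combo]} -> {t * 1e3:.2f} ms (modeled)")
```

With these toy numbers the search prefers the single "big" accelerator, because the modeled sharding traffic outweighs the extra compute capacity; that compute-versus-communication trade-off is the tension the framework navigates.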
Pages: 6
Related Papers (50 in total)
  • [31] Trireme: Exploration of Hierarchical Multi-level Parallelism for Hardware Acceleration
    Zacharopoulos, Georgios
    Ejjeh, Adel
    Jing, Ying
    Yang, En-Yu
    Jia, Tianyu
    Brumar, Iulian
    Intan, Jeremy
    Huzaifa, Muhammad
    Adve, Sarita
    Adve, Vikram
    Wei, Gu-Yeon
    Brooks, David
    ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 2023, 22 (03)
  • [32] Multi-level parallelism for incompressible flow computations on GPU clusters
    Jacobsen, Dana A.
    Senocak, Inanc
    PARALLEL COMPUTING, 2013, 39 (01) : 1 - 20
  • [33] Scalable State Space Search on the GPU with Multi-Level Parallelism
    Shipovalov, Egor
    Pryanichnikov, Valentin
    2020 19TH INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED COMPUTING (ISPDC 2020), 2020, : 84 - 92
  • [34] MARS: A multi-level array representation for simulation data
    Kim, Minsoo
    Suh, Ilhyun
    Chung, Yon Dohn
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2020, 111 : 419 - 434
  • [35] Multi-level parallelism in the block-Jacobi SVD algorithm
    Oksa, G
    Vajtersic, M
    NINTH EUROMICRO WORKSHOP ON PARALLEL AND DISTRIBUTED PROCESSING, PROCEEDINGS, 2001, : 306 - 313
  • [36] Load balancing multi-zone applications on a heterogeneous cluster with multi-level parallelism
    Wong, P
    Jin, HQ
    Becker, J
    ISPDC 2004: THIRD INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED COMPUTING/HETEROPAR '04: THIRD INTERNATIONAL WORKSHOP ON ALGORITHMS, MODELS AND TOOLS FOR PARALLEL COMPUTING ON HETEROGENEOUS NETWORKS, PROCEEDINGS, 2004, : 388 - 393
  • [37] Adaptive multi-teacher multi-level knowledge distillation
    Liu, Yuang
    Zhang, Wei
    Wang, Jun
    NEUROCOMPUTING, 2020, 415 : 106 - 113
  • [39] Exploiting the thread-level parallelism for BGP on Multi-core
    Gao Lei
    Lai Mingche
    Gong Zhenghu
    CNSR 2008: PROCEEDINGS OF THE 6TH ANNUAL COMMUNICATION NETWORKS AND SERVICES RESEARCH CONFERENCE, 2008, : 510 - 516
  • [40] A Multi-Factor Adaptive Multi-Level Cooperative Replacement Policy in Block Storage Systems
    Zhou, Yang
    Wang, Fang
    Shi, Zhan
    Feng, Dan
    2022 IEEE 40TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD 2022), 2022, : 67 - 75