DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion

Cited: 67
Authors
Niu, Wei [1 ]
Guan, Jiexiong [1 ]
Wang, Yanzhi [2 ]
Agrawal, Gagan [3 ]
Ren, Bin [1 ]
Affiliations
[1] William & Mary, Williamsburg, VA 23185 USA
[2] Northeastern Univ, Boston, MA 02115 USA
[3] Augusta Univ, Augusta, GA USA
Source
PROCEEDINGS OF THE 42ND ACM SIGPLAN INTERNATIONAL CONFERENCE ON PROGRAMMING LANGUAGE DESIGN AND IMPLEMENTATION (PLDI '21) | 2021
Funding
U.S. National Science Foundation (NSF)
Keywords
Compiler Optimization; Operator Fusion; Deep Neural Network; Mobile Devices; Transformations; Optimization; Locality; Loop
DOI
10.1145/3453483.3454083
Chinese Library Classification (CLC)
TP31 [Computer Software]
Subject Classification Codes
081202; 0835
Abstract
Deep Neural Networks (DNNs) have emerged as the core enabler of many major applications on mobile devices. To achieve high accuracy, DNN models have become increasingly deep, with hundreds or even thousands of operator layers, leading to high memory and computational requirements for inference. Operator fusion (or kernel/layer fusion) is a key optimization in many state-of-the-art DNN execution frameworks, such as TensorFlow, TVM, and MNN, aimed at improving the efficiency of DNN inference. However, these frameworks usually adopt fusion approaches based on fixed patterns that are too restrictive to cover the diversity of operators and layer connections, especially those seen in many extremely deep models. Polyhedral-based loop fusion techniques, on the other hand, work on a low-level view of the computation without operator-level information, and can also miss potential fusion opportunities. To address this challenge, this paper proposes a novel and extensive loop fusion framework called DNNFusion. The basic idea is to work at an operator-level view of DNNs, but to expand fusion opportunities by developing a classification of both individual operators and their combinations. In addition, DNNFusion includes 1) a novel mathematical-property-based graph rewriting framework to reduce evaluation costs and facilitate subsequent operator fusion, 2) integrated fusion plan generation that leverages the high-level analysis and accurate lightweight profiling, and 3) additional optimizations during fusion code generation. DNNFusion is extensively evaluated on 15 DNN models covering varied types of tasks, model sizes, and layer counts. The evaluation results demonstrate that DNNFusion finds up to 8.8x more fusion opportunities and outperforms four state-of-the-art DNN execution frameworks with a speedup of up to 9.3x. The reduction in memory requirements and the speedups enable the execution of many of the target models on mobile devices and even make them usable in real-time applications.
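
To make the core idea concrete, below is a minimal sketch of what fusing two elementwise operators (Add followed by ReLU) buys. The sketch is illustrative and not from the paper: the function names are hypothetical, and NumPy with explicit loops stands in for the native kernels a compiler like DNNFusion would actually emit. The unfused version runs two kernels and materializes an intermediate tensor in memory; the fused version makes a single pass and keeps the intermediate in a register. Both operators here have simple one-to-one input-to-output mappings, the kind of pairing the paper's operator classification marks as clearly profitable to fuse.

import numpy as np

def bias_relu_unfused(x, b):
    """Two kernels: Add writes a full intermediate tensor, ReLU reads it back."""
    t = np.empty_like(x)
    for i in range(x.size):          # kernel 1: Add (first pass over memory)
        t[i] = x[i] + b[i]
    y = np.empty_like(x)
    for i in range(x.size):          # kernel 2: ReLU (second pass over memory)
        y[i] = max(t[i], 0.0)
    return y

def bias_relu_fused(x, b):
    """One fused kernel: the Add result never leaves the register file."""
    y = np.empty_like(x)
    for i in range(x.size):          # single pass; no intermediate tensor
        y[i] = max(x[i] + b[i], 0.0)
    return y

x = np.random.randn(1024).astype(np.float32)
b = np.random.randn(1024).astype(np.float32)
assert np.allclose(bias_relu_unfused(x, b), bias_relu_fused(x, b))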
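The mathematical-property-based graph rewriting step can be illustrated with an equally small toy: a distributivity rule that collapses two matrix multiplications sharing an operand into one, shrinking the graph (and its evaluation cost) before fusion even runs. The tuple-based IR and the single rule below are hypothetical simplifications for exposition; the paper applies a broader catalog of rules derived from algebraic properties such as associativity, commutativity, and distributivity.

# Toy rewrite rule (illustrative, not DNNFusion's actual rule set):
#   Add(MatMul(A, x), MatMul(A, y))  ->  MatMul(A, Add(x, y))
# Distributivity halves the number of matrix multiplications, the
# dominant cost, and leaves a smaller graph for later operator fusion.

def rewrite_distributive(node):
    """node is a tiny expression tree: ('add'|'matmul', lhs, rhs), or a leaf str."""
    if (isinstance(node, tuple) and node[0] == 'add'
            and isinstance(node[1], tuple) and node[1][0] == 'matmul'
            and isinstance(node[2], tuple) and node[2][0] == 'matmul'
            and node[1][1] == node[2][1]):           # both MatMuls share operand A
        shared = node[1][1]
        return ('matmul', shared, ('add', node[1][2], node[2][2]))
    return node

expr = ('add', ('matmul', 'A', 'x'), ('matmul', 'A', 'y'))
print(rewrite_distributive(expr))  # ('matmul', 'A', ('add', 'x', 'y'))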
Pages: 883-898
Page count: 16