DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion

Cited by: 67
Authors
Niu, Wei [1]
Guan, Jiexiong [1]
Wang, Yanzhi [2]
Agrawal, Gagan [3]
Ren, Bin [1]
Affiliations
[1] William & Mary, Williamsburg, VA 23185 USA
[2] Northeastern Univ, Boston, MA 02115 USA
[3] Augusta Univ, Augusta, GA USA
Source
PROCEEDINGS OF THE 42ND ACM SIGPLAN INTERNATIONAL CONFERENCE ON PROGRAMMING LANGUAGE DESIGN AND IMPLEMENTATION (PLDI '21) | 2021
Funding
U.S. National Science Foundation
Keywords
Compiler Optimization; Operator Fusion; Deep Neural Network; Mobile Devices; TRANSFORMATIONS; OPTIMIZATION; LOCALITY; LOOP;
DOI
10.1145/3453483.3454083
Chinese Library Classification (CLC)
TP31 [Computer Software]
Subject Classification Codes
081202; 0835
Abstract
Deep Neural Networks (DNNs) have emerged as the core enabler of many major applications on mobile devices. To achieve high accuracy, DNN models have become increasingly deep, with hundreds or even thousands of operator layers, leading to high memory and computational requirements for inference. Operator fusion (or kernel/layer fusion) is a key optimization in many state-of-the-art DNN execution frameworks, such as TensorFlow, TVM, and MNN, aimed at improving the efficiency of DNN inference. However, these frameworks usually adopt fusion approaches based on certain patterns that are too restrictive to cover the diversity of operators and layer connections, especially those seen in many extremely deep models. Polyhedral-based loop fusion techniques, on the other hand, work on a low-level view of the computation without operator-level information, and can also miss potential fusion opportunities. To address this challenge, this paper proposes a novel and extensive loop fusion framework called DNNFusion. The basic idea of this work is to operate at an operator-level view of DNNs, but to expand fusion opportunities by developing a classification of both individual operators and their combinations. In addition, DNNFusion includes 1) a novel mathematical-property-based graph rewriting framework to reduce evaluation costs and facilitate subsequent operator fusion, 2) an integrated fusion plan generation algorithm that leverages high-level analysis and accurate, lightweight profiling, and 3) additional optimizations during fusion code generation. DNNFusion is extensively evaluated on 15 DNN models with varied types of tasks, model sizes, and layer counts. The evaluation results demonstrate that DNNFusion finds up to 8.8x more fusion opportunities and outperforms four state-of-the-art DNN execution frameworks with up to 9.3x speedup. The memory requirement reduction and speedups enable the execution of many of the target models on mobile devices and can even make them part of a real-time application.
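To make the key idea concrete, below is a minimal, hypothetical NumPy sketch of operator fusion; it illustrates the general technique only and is not DNNFusion's actual code or API. The elementwise bias-add and ReLU operators that follow a matrix multiplication are fused into in-place passes over the matmul output, so their intermediate tensors are never materialized.

import numpy as np

# Hypothetical illustration of operator fusion, not DNNFusion's implementation.

def unfused(x, w, b):
    t0 = x @ w                      # matmul writes a full intermediate tensor
    t1 = t0 + b                     # bias-add allocates a second full tensor
    return np.maximum(t1, 0.0)      # ReLU allocates a third full tensor

def fused(x, w, b):
    out = x @ w                     # single output buffer
    out += b                        # bias added in place: no new intermediate
    np.maximum(out, 0.0, out=out)   # ReLU applied in place over the same buffer
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8), dtype=np.float32)
w = rng.standard_normal((8, 16), dtype=np.float32)
b = rng.standard_normal(16, dtype=np.float32)
assert np.allclose(unfused(x, w, b), fused(x, w, b))

Eliminating the intermediate tensors is what cuts memory traffic and per-operator overhead; pattern-based frameworks fuse only fixed sequences like the matmul-bias-ReLU chain above, whereas DNNFusion's classification of operators and their combinations lets it fuse a far wider range of operator pairs.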
Pages: 883-898
Page count: 16
Related Papers
50 records in total
  • [41] Exploring Heterogeneous Algorithms for Accelerating Deep Convolutional Neural Networks on FPGAs
    Xiao, Qincheng
    Liang, Yun
    Lu, Liqiang
    Yan, Shengen
    Tai, Yu-Wing
    PROCEEDINGS OF THE 2017 54TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2017,
  • [43] EmbRace: Accelerating Sparse Communication for Distributed Training of Deep Neural Networks
    Li, Shengwei
    Lai, Zhiquan
    Li, Dongsheng
    Zhang, Yiming
    Ye, Xiangyu
    Duan, Yabo
    51ST INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2022, 2022,
  • [44] Accelerating and Compressing Deep Neural Networks for Massive MIMO CSI Feedback
    Erak, Omar
    Abou-Zeid, Hatem
    ICC 2023-IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, 2023, : 1029 - 1035
  • [45] Accelerating the characterization of dynamic DNA origami devices with deep neural networks
    Wang, Yuchen
    Jin, Xin
    Castro, Carlos
    SCIENTIFIC REPORTS, 2023, 13 (01)
  • [46] Accelerating Deep Convolutional Neural Networks Using Number Theoretic Transform
    Hong, Seongmin
    Arthanto, Yashael Faith
    Kim, Joo-Young
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2023, 70 (01) : 315 - 326
  • [47] Accelerating Training of Deep Neural Networks via Sparse Edge Processing
    Dey, Sourya
    Shao, Yinan
    Chugg, Keith M.
    Beerel, Peter A.
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2017, PT I, 2017, 10613 : 273 - 280
  • [48] Model fusion of deep neural networks for anomaly detection
    AlDahoul, Nouar
    Karim, Hezerul Abdul
    Wazir, Abdulaziz Saleh Ba
    JOURNAL OF BIG DATA, 2021, 8 (01)
  • [49] Multilevel Features Fusion in Deep Convolutional Neural Networks
    Zhuo, Yi-Fan
    Wang, Yi-Lei
    CLOUD COMPUTING AND SECURITY, PT VI, 2018, 11068 : 600 - 610
  • [50] Progressive spatiotemporal image fusion with deep neural networks
    Cai, Jiajun
    Huang, Bo
    Fung, Tung
    INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2022, 108