Accelerating CNN Algorithm with Fine-grained Dataflow Architectures

Cited by: 3
Authors
Xiang, Taoran [1 ,2 ]
Feng, Yujing [1 ]
Ye, Xiaochun [1 ]
Tan, Xu [1 ,2 ]
Li, Wenming [1 ]
Zhu, Yatao [1 ]
Wu, Meng [1 ]
Zhang, Hao [1 ]
Fan, Dongrui [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, ICT, State Key Lab Comp Architecture, Beijing, Peoples R China
[2] UCAS, Sch Comp & Control Engn, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
fine-grained dataflow; Convolutional Neural Network; general accelerator; data reuse; high parallel;
DOI
10.1109/HPCC/SmartCity/DSS.2018.00063
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Convolutional Neural Network (CNN) is a popular, state-of-the-art algorithm that is widely used in applications such as face recognition, intelligent monitoring, image recognition, and text recognition. Because of its high computational complexity, many efficient hardware accelerators have been proposed to exploit the high degree of parallelism in CNNs. However, accelerators implemented on FPGAs and ASICs usually sacrifice generality for higher performance and lower power consumption. Other accelerators, such as GPUs, are general enough but consume more power. Fine-grained dataflow architectures, which break from the conventional von Neumann architecture, show natural advantages in processing CNN-like algorithms with high computational efficiency and low power consumption, while remaining broadly applicable and adaptable. In this paper, we propose a scheme for implementing and optimizing CNNs on accelerators based on a fine-grained dataflow architecture. The experimental results show that, with our scheme, the performance of AlexNet running on the dataflow accelerator is 3.11x higher than on an NVIDIA Tesla K80, and the power consumption of our hardware is 8.52x lower than that of the K80.
Pages: 243-251
Number of pages: 9
Related papers
50 in total
  • [1] Fine-Grained Synchronizations and Dataflow Programming on GPUs
    Li, Ang
    van den Braak, Gert-Jan
    Corporaal, Henk
    Kumar, Akash
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING (ICS'15), 2015, : 109 - 118
  • [2] Leveraging Fine-grained Structured Sparsity for CNN Inference on Systolic Array Architectures
    Liu, Linqiao
    Brown, Stephen
    2021 31ST INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE LOGIC AND APPLICATIONS (FPL 2021), 2021, : 301 - 305
  • [3] HyConv: Accelerating Multi-Phase CNN Computation by Fine-Grained Policy Selection
    Li, Xiaqing
    Zhang, Guangyan
    Wang, Zhufan
    Zheng, Weimin
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2019, 30 (02) : 388 - 399
  • [4] A Fine-grained Performance Model for GPU Architectures
    Bombieri, Nicola
    Busato, Federico
    Fummi, Franco
    PROCEEDINGS OF THE 2016 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE), 2016, : 1267 - 1272
  • [5] Towards Fine-Grained Dataflow Parallelism in Big Data Systems
    Ertel, Sebastian
    Adam, Justus
    Castrillon, Jeronimo
    LANGUAGES AND COMPILERS FOR PARALLEL COMPUTING, LCPC 2017, 2019, 11403 : 281 - 282
  • [6] Accelerating RSA with Fine-Grained Parallelism Using GPU
    Yang, Yang
    Guan, Zhi
    Sun, Huiping
    Chen, Zhong
    INFORMATION SECURITY PRACTICE AND EXPERIENCE, ISPEC 2015, 2015, 9065 : 454 - 468
  • [7] Bilinear CNN Models for Fine-grained Visual Recognition
    Lin, Tsung-Yu
    RoyChowdhury, Aruni
    Maji, Subhransu
    2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 1449 - 1457
  • [8] Fault Diagnosis of Gearbox in Multiple Conditions Based on Fine-Grained Classification CNN Algorithm
    Jiang, Pengcheng
    Cong, Hua
    Wang, Jing
    Zhang, Dongsheng
    SHOCK AND VIBRATION, 2020, 2020
  • [9] Fast Attention CNN for Fine-Grained Crack Segmentation
    Lee, Hyunnam
    Yoo, Juhan
    SENSORS, 2023, 23 (04)
  • [10] Strengthening Component Architectures by Modeling Fine-grained Entities
    Bures, Tomas
    Jezek, Pavel
    Malohlava, Michal
    Poch, Tomas
    Sery, Ondrej
    2011 37TH EUROMICRO CONFERENCE ON SOFTWARE ENGINEERING AND ADVANCED APPLICATIONS (SEAA 2011), 2011, : 124 - 128