Accelerating CNN Algorithm with Fine-grained Dataflow Architectures

Cited by: 3
Authors
Xiang, Taoran [1 ,2 ]
Feng, Yujing [1 ]
Ye, Xiaochun [1 ]
Tan, Xu [1 ,2 ]
Li, Wenming [1 ]
Zhu, Yatao [1 ]
Wu, Meng [1 ]
Zhang, Hao [1 ]
Fan, Dongrui [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, ICT, State Key Lab Comp Architecture, Beijing, Peoples R China
[2] UCAS, Sch Comp & Control Engn, Beijing, Peoples R China
Source
IEEE 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS / IEEE 16TH INTERNATIONAL CONFERENCE ON SMART CITY / IEEE 4TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS) | 2018
Funding
National Natural Science Foundation of China;
Keywords
fine-grained dataflow; Convolutional Neural Network; general accelerator; data reuse; high parallel;
DOI
10.1109/HPCC/SmartCity/DSS.2018.00063
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Convolutional Neural Network (CNN) is a state-of-the-art algorithm widely used in applications such as face recognition, intelligent monitoring, image recognition, and text recognition. Because of its high computational complexity, many efficient hardware accelerators have been proposed to exploit a high degree of parallelism for CNNs. However, accelerators implemented on FPGAs and ASICs usually sacrifice generality for higher performance and lower power consumption. Other accelerators, such as GPUs, are general enough but consume more power. Fine-grained dataflow architectures, which break from the conventional von Neumann architecture, show natural advantages in processing CNN-like algorithms with high computational efficiency and low power consumption, while remaining broadly applicable and adaptable. In this paper, we propose a scheme for implementing and optimizing CNNs on accelerators based on a fine-grained dataflow architecture. The experimental results show that, using our scheme, the performance of AlexNet running on the dataflow accelerator is 3.11x higher than on an NVIDIA Tesla K80, and the power consumption of our hardware is 8.52x lower than that of the K80.
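The two ratios reported in the abstract can be combined into a rough performance-per-watt figure. The abstract does not state this figure itself; the sketch below assumes both ratios were measured on the same AlexNet workload, and the function and constant names are illustrative only.

```python
# Ratios quoted in the abstract (AlexNet, dataflow accelerator vs. NVIDIA Tesla K80).
SPEEDUP = 3.11          # performance ratio: accelerator / K80
POWER_REDUCTION = 8.52  # power ratio: K80 / accelerator


def perf_per_watt_gain(speedup: float, power_reduction: float) -> float:
    """Relative performance per watt.

    Assumption (not stated in the abstract): both ratios describe the
    same workload, so the two gains simply multiply.
    """
    return speedup * power_reduction


gain = perf_per_watt_gain(SPEEDUP, POWER_REDUCTION)
print(f"Implied performance-per-watt gain: {gain:.1f}x")
```

Under that assumption the implied gain is about 26.5x, which is the same as saying the accelerator would use roughly 1/26.5 of the energy per inference.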
Pages: 243-251
Page count: 9