A Hardware-Friendly High-Precision CNN Pruning Method and Its FPGA Implementation

Cited by: 5
Authors
Sui, Xuefu [1 ,2 ,3 ]
Lv, Qunbo [1 ,2 ,3 ]
Zhi, Liangjie [1 ,2 ,3 ]
Zhu, Baoyu [1 ,2 ,3 ]
Yang, Yuanbo [1 ,2 ,3 ]
Zhang, Yu [1 ,2 ,3 ]
Tan, Zheng [1 ,3 ]
Affiliations
[1] Chinese Acad Sci, Aerosp Informat Res Inst, 9 Dengzhuang South Rd, Beijing 100094, Peoples R China
[2] Univ Chinese Acad Sci, Sch Optoelect, 19A Yuquan Rd, Beijing 100049, Peoples R China
[3] Chinese Acad Sci, Key Lab Computat Opt Imaging Technol, 9 Dengzhuang South Rd, Beijing 100094, Peoples R China
Keywords
convolutional neural networks; hardware friendly; network compression; regular pruning; LR tracking; high parallelism; CONVOLUTIONAL NEURAL-NETWORK;
DOI
10.3390/s23020824
Chinese Library Classification
O65 [Analytical Chemistry];
Discipline Codes
070302 ; 081704 ;
Abstract
To address the problems of large storage requirements, computational pressure, untimely data supply from off-chip memory, and low computational efficiency caused by the large number of convolutional neural network (CNN) parameters during hardware deployment, we developed an innovative hardware-friendly CNN pruning method called KRP, which prunes convolutional kernels at the row scale. A new retraining method based on LR tracking was used to obtain a CNN model with both a high pruning rate and high accuracy. Furthermore, we designed a high-performance convolutional computation module on the FPGA platform to support the deployment of KRP-pruned models. Comparative experiments on CNNs such as VGG and ResNet showed that KRP achieves higher accuracy than most pruning methods. Moreover, KRP, together with the GSNQ quantization method developed in our previous study, forms a high-precision, hardware-friendly network compression framework that achieves "lossless" CNN compression with a 27x reduction in network model storage. Comparative experiments on the FPGA showed that the KRP pruning method not only requires far less storage space but also reduces on-chip hardware resource consumption by more than half and effectively improves the parallelism of the model in FPGAs, demonstrating strong hardware friendliness. This study provides further ideas for applying CNNs in the field of edge computing.
Pages: 22
Related Papers
50 records in total
  • [1] A hardware-friendly logarithmic quantization method for CNNs and FPGA implementation
    Jiang, Tao
    Xing, Ligang
    Yu, Jinming
    Qian, Junchao
    [J]. JOURNAL OF REAL-TIME IMAGE PROCESSING, 2024, 21 (04)
  • [2] SMOF: Squeezing More Out of Filters Yields Hardware-Friendly CNN Pruning
    Liu, Yanli
    Guan, Bochen
    Li, Weiyi
    Xu, Qinwen
    Quan, Shuxue
    [J]. ARTIFICIAL INTELLIGENCE, CICAI 2022, PT I, 2022, 13604 : 242 - 254
  • [3] A Hardware-Friendly Low-Bit Power-of-Two Quantization Method for CNNs and Its FPGA Implementation
    Sui, Xuefu
    Lv, Qunbo
    Bai, Yang
    Zhu, Baoyu
    Zhi, Liangjie
    Yang, Yuanbo
    Tan, Zheng
    [J]. SENSORS, 2022, 22 (17)
  • [4] Hardware-Friendly Approximation for Swish Activation and Its Implementation
    Choi, Kangjoon
    Kim, Sungho
    Kim, Jeongmin
    Park, In-Cheol
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2024, 71 (10) : 4516 - 4520
  • [5] A Method of FPGA-Based Extraction of High-Precision Time-Difference Information and Implementation of Its Hardware Circuit
    Li, Jian
    Yan, Xinlei
    Li, Maojin
    Meng, Ming
    Yan, Xin
    [J]. SENSORS, 2019, 19 (23)
  • [6] HRBP: Hardware-friendly Regrouping towards Block-based Pruning for Sparse CNN Training
    Ma, Haoyu
    Zhang, Chengming
    Xiang, Lizhi
    Ma, Xiaolong
    Yuan, Geng
    Zhang, Wenkai
    Liu, Shiwei
    Chen, Tianlong
    Tao, Dingwen
    Wang, Yanzhi
    Wang, Zhangyang
    Xie, Xiaohui
    [J]. CONFERENCE ON PARSIMONY AND LEARNING, VOL 234, 2024, 234 : 282 - 301
  • [7] A high-precision Wave Union TDC implementation in FPGA
    Caponio, F.
    Abba, A.
    Lusardi, N.
    Geraci, A.
    [J]. 2013 IEEE NUCLEAR SCIENCE SYMPOSIUM AND MEDICAL IMAGING CONFERENCE (NSS/MIC), 2013,
  • [8] HFPQ: deep neural network compression by hardware-friendly pruning-quantization
    Fan, YingBo
    Pang, Wei
    Lu, ShengLi
    [J]. Applied Intelligence, 2021, 51 : 7016 - 7028
  • [9] Single-shot pruning and quantization for hardware-friendly neural network acceleration
    Jiang, Bofeng
    Chen, Jun
    Liu, Yong
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 126
  • [10] LOW PRECISION LOCAL LEARNING FOR HARDWARE-FRIENDLY NEUROMORPHIC VISUAL RECOGNITION
    Acharya, Jyotibdha
    Iyer, Laxmi R.
    Jiang, Wenyu
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 8937 - 8941