A Dynamically Reconfigurable Accelerator Design Using a Sparse-Winograd Decomposition Algorithm for CNNs

Cited by: 2
Authors
Zhao, Yunping [1]
Lu, Jianzhuang [1]
Chen, Xiaowen [1]
Affiliation
[1] Natl Univ Def Technol, Changsha, Peoples R China
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2021 / Vol. 66 / No. 01
Keywords
High performance computing; accelerator architecture; hardware; NEURAL-NETWORK; CONVOLUTION;
DOI
10.32604/cmc.2020.012380
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Convolutional Neural Networks (CNNs) are widely used in many fields. Because CNNs are computationally intensive and demand high throughput, however, an increasing number of researchers are focusing on how to improve the computational efficiency, hardware utilization, or flexibility of CNN hardware accelerators. Accordingly, this paper proposes a dynamically reconfigurable accelerator with a high-parallelism hardware architecture based on Sparse-Winograd F(2 x 2, 3 x 3). This approach not only eliminates the pre-calculation complexity associated with the Winograd algorithm, thereby reducing the difficulty of hardware implementation, but also greatly improves the flexibility of the hardware; as a result, the accelerator can compute Conventional Convolution, Grouped Convolution (GCONV), or Depthwise Separable Convolution (DSC) using the same hardware architecture. Our experimental results show that the accelerator achieves a 3x-4.14x speedup on VGG-16 and MobileNet V1 compared with designs that do not use the acceleration algorithm. Moreover, compared with previous designs using the traditional Winograd algorithm, the accelerator achieves a 1.4x-1.8x speedup. At the same time, the efficiency of the multiplier improves by up to 142%.
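For readers unfamiliar with the notation, the following is a minimal sketch of the standard (dense) Winograd F(2 x 2, 3 x 3) transform that the abstract builds on. It is not the authors' sparse decomposition or their hardware design, neither of which is described in this record. The function names (`winograd_f2x2_3x3`, `direct_conv`) are hypothetical, and the transform matrices are the commonly used textbook ones. The point it illustrates is that a 2 x 2 output tile needs 16 element-wise multiplications instead of the 36 used by direct convolution.

```python
from fractions import Fraction

# Commonly used F(2, 3) transform matrices (textbook choice, assumed here).
H = Fraction(1, 2)
BT = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]  # input transform B^T
G = [[1, 0, 0], [H, H, H], [H, -H, H], [0, 0, 1]]                 # kernel transform G
AT = [[1, 1, 1, 0], [0, 1, -1, -1]]                               # output transform A^T


def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(r) for r in zip(*m)]


def winograd_f2x2_3x3(tile, kernel):
    """Compute a 2x2 output tile from a 4x4 input tile and a 3x3 kernel."""
    u = matmul(matmul(G, kernel), transpose(G))    # G g G^T, 4x4
    v = matmul(matmul(BT, tile), transpose(BT))    # B^T d B, 4x4
    # Element-wise product: the only 16 "real" multiplications.
    m = [[u[i][j] * v[i][j] for j in range(4)] for i in range(4)]
    return matmul(matmul(AT, m), transpose(AT))    # A^T m A, 2x2


def direct_conv(tile, kernel):
    """Reference sliding-window result (CNN-style correlation), 36 multiplications."""
    return [[sum(tile[i + r][j + c] * kernel[r][c]
                 for r in range(3) for c in range(3))
             for j in range(2)] for i in range(2)]


tile = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
assert winograd_f2x2_3x3(tile, kernel) == direct_conv(tile, kernel)
```

Exact rational arithmetic (`Fraction`) is used so the two results compare equal without floating-point tolerance. A sparse variant would presumably skip element-wise products where either transformed operand is zero, but that is an assumption about the general technique rather than something stated in the abstract.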
Pages: 517-535
Page count: 19
Related papers
50 in total
  • [21] Towards the Generic Reconfigurable Accelerator: Algorithm Development, Core Design, and Performance Analysis
    Navas, Byron
    Oberg, Johnny
    Sander, Ingo
    2013 INTERNATIONAL CONFERENCE ON RECONFIGURABLE COMPUTING AND FPGAS (RECONFIG), 2013,
  • [22] A FPGA-based accelerator implementaion for YOLOv2 object detection using Winograd algorithm
    Lv, Peng
    Liu, Wei
    Li, Jinghui
    2020 5TH INTERNATIONAL CONFERENCE ON MECHANICAL, CONTROL AND COMPUTER ENGINEERING (ICMCCE 2020), 2020, : 1894 - 1898
  • [23] Sparse Decomposition Algorithm Using Immune Matching Pursuit
    Zhou, Yan
    Zhao, Heming
    Liu, Tao
    PROCEEDINGS OF 2012 IEEE 11TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP) VOLS 1-3, 2012, : 489 - +
  • [24] Energy-Efficient Accelerator Design with 3D-SRAM and Hierarchical Interconnection Architecture for Compact Sparse CNNs
    Lo, Chin-Yang
    Huang, Po-Tsang
    Hwang, Wei
    2020 2ND IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE CIRCUITS AND SYSTEMS (AICAS 2020), 2020, : 320 - 323
  • [25] FPGA implementation of dynamically reconfigurable IoT security module using algorithm hopping
    Soliman, Shady
    Jaela, Mohammed A.
    Abotaleb, Abdelrhman M.
    Hassan, Youssef
    Abdelghany, Mohamed A.
    Abdel-Hamid, Amr T.
    Salama, Khaled N.
    Mostafa, Hassan
    INTEGRATION-THE VLSI JOURNAL, 2019, 68 : 108 - 121
  • [26] WinoNN: Optimizing FPGA-Based Convolutional Neural Network Accelerators Using Sparse Winograd Algorithm
    Wang, Xuan
    Wang, Chao
    Cao, Jing
    Gong, Lei
    Zhou, Xuehai
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2020, 39 (11) : 4290 - 4302
  • [27] FPGA-Based Reconfigurable Convolutional Neural Network Accelerator Using Sparse and Convolutional Optimization
    Gowda, Kavitha Malali Vishveshwarappa
    Madhavan, Sowmya
    Rinaldi, Stefano
    Divakarachari, Parameshachari Bidare
    Atmakur, Anitha
    ELECTRONICS, 2022, 11 (10)
  • [28] On the Design of Highly Reliable System-on-Chip using Dynamically Reconfigurable FPGAs
    Du, Boyang
    Sterpone, Luca
    Venditti, Lorenzo
    Codinachs, David Merodio
    2015 10TH INTERNATIONAL SYMPOSIUM ON RECONFIGURABLE COMMUNICATION-CENTRIC SYSTEMS-ON-CHIP (RECOSOC), 2015,
  • [29] A dynamically and partially reconfigurable implementation of the IDEA algorithm using FPGAs and handel-C
    Granado-Criado, Jose M.
    Vega-Rodriguez, Miguel A.
    Sanchez-Perez, Juan M.
    Gomez-Pulido, Juan A.
    JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2007, 13 (03) : 407 - 418
  • [30] Reconfigurable MIMO Transceiver Design using the Tunable Channel Decomposition
    Wang, Jing
    Sobelman, Gerald E.
    2010 CONFERENCE RECORD OF THE FORTY FOURTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS (ASILOMAR), 2010, : 381 - 384