A Dynamically Reconfigurable Accelerator Design Using a Sparse-Winograd Decomposition Algorithm for CNNs

Cited by: 2
Authors
Zhao, Yunping [1]
Lu, Jianzhuang [1]
Chen, Xiaowen [1]
Affiliations
[1] National University of Defense Technology, Changsha, People's Republic of China
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2021, Vol. 66, No. 1
Keywords
High performance computing; accelerator architecture; hardware; NEURAL-NETWORK; CONVOLUTION;
DOI
10.32604/cmc.2020.012380
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Convolutional Neural Networks (CNNs) are widely used in many fields. Due to their high throughput and high level of computing characteristics, however, an increasing number of researchers are focusing on how to improve the computational efficiency, hardware utilization, or flexibility of CNN hardware accelerators. Accordingly, this paper proposes a dynamically reconfigurable accelerator architecture that implements a Sparse-Winograd F(2 x 2, 3 x 3)-based high-parallelism hardware architecture. This approach not only eliminates the pre-calculation complexity associated with the Winograd algorithm, thereby reducing the difficulty of hardware implementation, but also greatly improves the flexibility of the hardware; as a result, the accelerator can realize the calculation of Conventional Convolution, Grouped Convolution (GCONV) or Depthwise Separable Convolution (DSC) using the same hardware architecture. Our experimental results show that the accelerator achieves a 3x-4.14x speedup compared with the designs that do not use the acceleration algorithm on VGG-16 and MobileNet V1. Moreover, compared with previous designs using the traditional Winograd algorithm, the accelerator design achieves 1.4x-1.8x speedup. At the same time, the efficiency of the multiplier improves by up to 142%.
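For readers unfamiliar with the F(2 x 2, 3 x 3) notation in the abstract, the following is a minimal sketch of the standard (dense) Winograd minimal-filtering step, which produces a 2 x 2 output tile from a 4 x 4 input tile and a 3 x 3 kernel using 16 element-wise multiplications instead of the 36 a direct computation needs. It assumes the commonly used textbook transform matrices; it is not the paper's sparse decomposition or its hardware mapping, and all function names here are illustrative.

```python
# Assumption: textbook Winograd F(2x2, 3x3) transform matrices,
# not the sparse decomposition proposed in the paper.
BT = [[1, 0, -1, 0],
      [0, 1, 1, 0],
      [0, -1, 1, 0],
      [0, 1, 0, -1]]          # input transform B^T (4x4)
G = [[1.0, 0.0, 0.0],
     [0.5, 0.5, 0.5],
     [0.5, -0.5, 0.5],
     [0.0, 0.0, 1.0]]         # kernel transform G (4x3)
AT = [[1, 1, 1, 0],
      [0, 1, -1, -1]]         # output transform A^T (2x4)


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def winograd_f2x2_3x3(tile, kernel):
    """Y = A^T [ (G g G^T) * (B^T d B) ] A for a 4x4 tile d and 3x3 kernel g."""
    u = matmul(matmul(G, kernel), transpose(G))    # transformed kernel, 4x4
    v = matmul(matmul(BT, tile), transpose(BT))    # transformed input, 4x4
    # Element-wise product: the only 16 "real" multiplications.
    m = [[u[i][j] * v[i][j] for j in range(4)] for i in range(4)]
    return matmul(matmul(AT, m), transpose(AT))    # 2x2 output tile


def direct_conv(tile, kernel):
    """Reference: direct 3x3 sliding-window sum (CNN-style correlation)."""
    return [[sum(tile[i + r][j + c] * kernel[r][c]
                 for r in range(3) for c in range(3))
             for j in range(2)]
            for i in range(2)]


tile = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
fast = winograd_f2x2_3x3(tile, kernel)
ref = direct_conv(tile, kernel)
```

In this sketch the kernel transform `u` depends only on the weights, so it could be computed once per filter; zeros in `u` or `v` are the kind of entries a sparsity-aware design could skip, though how the paper exploits them is not described in this record.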
Pages: 517-535
Number of pages: 19
Related papers
50 in total
  • [41] Dynamically Reconfigurable NoC using a Deadlock-Free Flexible Routing Algorithm with a Low Hardware Implementation Cost
    Castillo, Ernesto Villegas
    Chau, Wang Jiang
    Miorandi, Gabriele
    Bertozzi, Davide
    2015 IEEE 6TH LATIN AMERICAN SYMPOSIUM ON CIRCUITS & SYSTEMS (LASCAS), 2015,
  • [42] SPARSE CODE MULTIPLE ACCESS CODEBOOK DESIGN USING SINGULAR VALUE DECOMPOSITION
    Vidal Beltran, S.
    Carreno Aguilera, R.
    Lopez Bonilla, J. L.
    FRACTALS-COMPLEX GEOMETRY PATTERNS AND SCALING IN NATURE AND SOCIETY, 2020, 28 (07)
  • [43] Design Optimization of Combined Function Accelerator Magnet Using Truncated Singular Value Decomposition
    Dhakarwal, M.
    Abe, M.
    Ogitsu, T.
    Sugano, M.
    IEEE TRANSACTIONS ON APPLIED SUPERCONDUCTIVITY, 2023, 33 (05)
  • [44] Sublogarithmic distributed MIS algorithm for sparse graphs using Nash-Williams decomposition
    Barenboim, Leonid
    Elkin, Michael
    DISTRIBUTED COMPUTING, 2010, 22 (5-6) : 363 - 379
  • [45] Noise suppression for electronic images using variational mode decomposition and sparse SURE algorithm
    Li Q.
    Liang S.Y.
    Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology, 2020, 52 (04): : 101 - 111
  • [46] Sublogarithmic distributed MIS algorithm for sparse graphs using Nash-Williams decomposition
    Leonid Barenboim
    Michael Elkin
    Distributed Computing, 2010, 22 : 363 - 379
  • [47] Sublogarithmic Distributed MIS Algorithm for Sparse Graphs using Nash-Williams Decomposition
    Barenboim, Leonid
    Elkin, Michael
    PODC'08: PROCEEDINGS OF THE 27TH ANNUAL ACM SYMPOSIUM ON PRINCIPLES OF DISTRIBUTED COMPUTING, 2008, : 25 - 34
  • [48] Sparse FIR Filter Design Using Artificial Bee Colony Algorithm
    Raju, Rija
    Kwan, Hon Keung
    Jiang, Aimin
    2018 IEEE 61ST INTERNATIONAL MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS (MWSCAS), 2018, : 956 - 959
  • [49] DyAFNoC: Characterization and Analysis of a Dynamically Reconfigurable NoC using a DOR-based Deadlock-Free Routing Algorithm
    Castillo, Ernesto Villegas
    Miorandi, Gabriele
    Chau, Wang Jiang
    2014 EIGHTH IEEE/ACM INTERNATIONAL SYMPOSIUM ON NETWORKS-ON-CHIP (NOCS), 2014, : 190 - 191
  • [50] An efficient lightweight algorithm for scheduling tasks onto dynamically reconfigurable hardware using graph-oriented simulated annealing
    Mollajafari, Morteza
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (24): : 18035 - 18057