An Efficient Accelerator for Multiple Convolutions From the Sparsity Perspective

Cited by: 13
Authors
Chen, Qinyu [1]
Huang, Yan [1]
Sun, Rui [1]
Song, Wenqing [1]
Lu, Zhonghai [2]
Fu, Yuxiang [1]
Li, Li [1]
Affiliations
[1] Nanjing Univ, Sch Elect & Engn, Nanjing 210000, Peoples R China
[2] KTH Royal Inst Technol, S-11428 Stockholm, Sweden
Keywords
Data processing; Computer architecture; Very large scale integration; Hardware; Registers; Microsoft Windows; Kernel; Dilated convolutions (DCONVs) and transposed convolutions (TCONVs); load balance; sparsity; VLSI; ARCHITECTURE
DOI
10.1109/TVLSI.2020.2976454
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
Convolutional neural networks (CNNs) have become one of the most widely applied techniques in many fields. These networks deliver better performance as they grow deeper and larger; however, the resulting computational complexity and storage demands impede hardware implementation. Quantized networks have been proposed to address this problem. In addition, various convolutional structures have been designed to meet the requirements of different applications. For example, whereas image classification mainly relies on traditional convolutions (CONVs), image generation usually combines traditional CONVs, dilated CONVs (DCONVs), and transposed CONVs (TCONVs), which leads to a difficult hardware mapping problem. In this brief, we translate this mapping problem into a sparsity problem and propose an efficient hardware architecture for sparse binary and ternary CNNs that exploits their sparsity and low bit-width characteristics. To this end, we propose an ineffectual data removing (IDR) mechanism that removes both regular and irregular sparsity using dual-channel processing elements (PEs). In addition, a flexible layered load balance (LLB) mechanism is introduced to alleviate load imbalance. The accelerator is implemented in a 65-nm technology with a core size of 2.56 mm². It achieves an energy efficiency of 3.72 TOPS/W at 50.1 mW, making it a promising design for embedded devices.
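The sparsity perspective in the abstract, that DCONVs and TCONVs reduce to ordinary CONVs over operands with regularly inserted zeros, can be made concrete with a short sketch. The NumPy example below is a minimal illustration of that equivalence, not the paper's design; the helper names (conv2d_valid, dilate_kernel, zero_insert) and the toy sizes are assumptions made for this example.

    import numpy as np

    def conv2d_valid(x, k):
        # Naive "valid" 2-D sliding-window sum of products (cross-correlation).
        H, W = x.shape
        kh, kw = k.shape
        out = np.zeros((H - kh + 1, W - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
        return out

    def dilate_kernel(k, d):
        # Dilated CONV: insert d-1 zeros between kernel taps.
        kh, kw = k.shape
        kd = np.zeros((d * (kh - 1) + 1, d * (kw - 1) + 1))
        kd[::d, ::d] = k
        return kd

    def zero_insert(x, s):
        # Transposed CONV: insert s-1 zeros between input pixels.
        H, W = x.shape
        xs = np.zeros((s * (H - 1) + 1, s * (W - 1) + 1))
        xs[::s, ::s] = x
        return xs

    x = np.random.randn(8, 8)   # toy feature map
    k = np.random.randn(3, 3)   # toy 3x3 kernel

    # Dilated CONV (rate 2) as a plain CONV with a zero-inserted kernel.
    y_dconv = conv2d_valid(x, dilate_kernel(k, 2))       # -> 4x4 output

    # Transposed CONV (stride 2) as a plain CONV over a zero-inserted,
    # zero-padded input with a flipped kernel.
    x_up = np.pad(zero_insert(x, 2), 2)                  # pad by kh-1 = 2
    y_tconv = conv2d_valid(x_up, k[::-1, ::-1])          # -> 17x17 output

    # Most multiplications above hit structured zeros; skipping them is
    # exactly the regular sparsity that an IDR-style mechanism removes.
    print(y_dconv.shape, y_tconv.shape)

For scale, the reported 3.72-TOPS/W efficiency at 50.1 mW implies a peak throughput of roughly 0.19 TOPS.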
Pages: 1540-1544
Page count: 5