An Efficient Accelerator for Multiple Convolutions From the Sparsity Perspective

Cited: 13
Authors
Chen, Qinyu [1 ]
Huang, Yan [1 ]
Sun, Rui [1 ]
Song, Wenqing [1 ]
Lu, Zhonghai [2 ]
Fu, Yuxiang [1 ]
Li, Li [1 ]
Affiliations
[1] Nanjing Univ, Sch Elect & Engn, Nanjing 210000, Peoples R China
[2] KTH Royal Inst Technol, S-11428 Stockholm, Sweden
Keywords
Data processing; Computer architecture; Very large scale integration; Hardware; Registers; Microsoft Windows; Kernel; Dilated convolutions (DCONVs) and transposed convolutions (TCONVs); load balance; sparsity; VLSI; architecture
DOI
10.1109/TVLSI.2020.2976454
CLC Classification
TP3 [Computing Technology, Computer Technology]
Subject Classification
0812
Abstract
Convolutional neural networks (CNNs) have become one of the most widely used techniques in many fields, and they generally deliver better performance as they grow deeper and larger. However, their heavy computation and storage demands impede hardware implementation. Quantized networks have been proposed to address this problem. In addition, various convolutional structures have been designed to meet the requirements of different applications. For example, whereas image classification relies mainly on traditional convolutions (CONVs), networks for image generation usually combine traditional CONVs, dilated CONVs (DCONVs), and transposed CONVs (TCONVs), which leads to a difficult hardware mapping problem. In this brief, we translate this mapping problem into a sparsity problem and propose an efficient hardware architecture for sparse binary and ternary CNNs that exploits their sparsity and low bit-width. To this end, we propose an ineffectual data removing (IDR) mechanism, built on dual-channel processing elements (PEs), that removes both regular and irregular sparsity. In addition, a flexible layered load balance (LLB) mechanism is introduced to alleviate load imbalance. The accelerator is implemented in 65-nm technology with a core size of 2.56 mm². It achieves 3.72-TOPS/W energy efficiency at 50.1 mW, making it a promising design for embedded devices.
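The abstract's central idea, recasting DCONV/TCONV mapping as a sparsity problem and then skipping the ineffectual (zero-operand) work, can be illustrated compactly. Below is a minimal NumPy sketch under stated assumptions, not the paper's IDR hardware: `zero_insert` and `conv2d_skip_zeros` are hypothetical names, a real TCONV's kernel flipping and edge padding are omitted for brevity, and the skip counter merely models in software what the dual-channel PEs exploit in silicon.

```python
# Minimal sketch (NOT the paper's implementation): a transposed convolution
# can be mapped onto an ordinary convolution by inserting zeros into the
# input; the inserted zeros are "regular" sparsity, and skipping zero
# operands mimics the ineffectual-data-removing (IDR) idea in software.
import numpy as np

def zero_insert(x, stride):
    """Insert (stride - 1) zeros between input pixels (regular sparsity)."""
    h, w = x.shape
    out = np.zeros(((h - 1) * stride + 1, (w - 1) * stride + 1), dtype=x.dtype)
    out[::stride, ::stride] = x
    return out

def conv2d_skip_zeros(x, k):
    """Direct 2-D convolution that issues no MAC for zero input operands."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    y = np.zeros((oh, ow))
    skipped = total = 0
    for i in range(oh):
        for j in range(ow):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    total += 1
                    v = x[i + di, j + dj]
                    if v == 0:            # ineffectual operand: skip the MAC
                        skipped += 1
                        continue
                    acc += v * k[di, dj]
            y[i, j] = acc
    return y, skipped / total

x = np.arange(1, 10, dtype=float).reshape(3, 3)  # toy 3x3 feature map
k = np.ones((3, 3))                              # toy all-ones kernel
y, frac = conv2d_skip_zeros(zero_insert(x, 2), k)
print(y.shape, f"{frac:.0%} of candidate MACs skipped")
```

On this toy input the zero insertion alone makes about 69% of the candidate multiplications ineffectual; that is the regular sparsity the IDR mechanism removes, while irregular sparsity from binary/ternary zero weights would be skipped analogously.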
Pages: 1540-1544
Page count: 5