An Efficient CNN Inference Accelerator Based on Intra- and Inter-Channel Feature Map Compression

Cited by: 1
Authors
Xie, Chenjia [1]
Shao, Zhuang [1]
Zhao, Ning [1]
Du, Yuan [1]
Du, Li [1]
Affiliations
[1] Nanjing Univ, Sch Elect Sci & Engn, Nanjing 210023, Peoples R China
Keywords
Deep convolution neural networks; interlayer feature map compression; principal component analysis; DEEP; NETWORKS;
DOI
10.1109/TCSI.2023.3287602
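Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809
Abstract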
Deep convolutional neural networks (CNNs) generate intensive inter-layer data during inference, which demands substantial on-chip memory and off-chip bandwidth. To ease this memory constraint, this paper proposes an accelerator adopting a compression technique that reduces the inter-layer data by removing both intra- and inter-channel redundant information. Principal component analysis (PCA) is used in the compression process to concentrate inter-channel information. Spatial differencing, truncation, and reconfigurable bit-width coding are applied within every feature map to eliminate intra-channel data redundancy. Moreover, a particular data arrangement is introduced to enhance data continuity, which optimizes the PCA and improves compression performance. A CNN accelerator with the proposed compression technique is designed to support on-the-fly compression by pipelining the reconstruction, CNN computation, and compression operations. The prototype accelerator is implemented in 28-nm CMOS technology. It achieves 819.2 GOPS peak throughput and 3.75 TOPS/W energy efficiency at 218.5 mW. Experiments show that the proposed compression technique achieves compression ratios of 21.5% to 43.0% (8-bit mode) and 9.8% to 19.3% (16-bit mode) on state-of-the-art CNNs with a negligible accuracy loss.
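The two redundancy-removal ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the accelerator's actual pipeline. The function names, the (C, H, W) layout, the eigendecomposition-based PCA, and the horizontal differencing direction are all hypothetical choices, and truncation, bit-width coding, and the paper's data arrangement are omitted.

```python
import numpy as np

def pca_compress_channels(fmap, k):
    """Project a (C, H, W) feature map onto its top-k inter-channel principal components.

    Hypothetical helper: the paper's exact PCA procedure is not given in the abstract.
    """
    c, h, w = fmap.shape
    x = fmap.reshape(c, h * w).astype(np.float64)   # one row per channel
    mean = x.mean(axis=1, keepdims=True)            # per-channel mean
    xc = x - mean
    cov = xc @ xc.T / (h * w)                       # (C, C) channel covariance
    _, eigvecs = np.linalg.eigh(cov)                # eigenvalues in ascending order
    basis = eigvecs[:, ::-1][:, :k]                 # (C, k): top-k components
    coeffs = basis.T @ xc                           # (k, H*W) concentrated data
    return coeffs.reshape(k, h, w), basis, mean

def pca_reconstruct(coeffs, basis, mean):
    """Approximate inverse of pca_compress_channels."""
    k, h, w = coeffs.shape
    x = basis @ coeffs.reshape(k, h * w) + mean
    return x.reshape(-1, h, w)

def spatial_difference(plane):
    """Keep the first column and replace the rest with horizontal deltas (assumed direction)."""
    diff = plane.copy()
    diff[:, 1:] = plane[:, 1:] - plane[:, :-1]
    return diff

def inverse_spatial_difference(diff):
    """Undo spatial_difference with a running sum along each row."""
    return np.cumsum(diff, axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fmap = rng.standard_normal((16, 8, 8))
    coeffs, basis, mean = pca_compress_channels(fmap, k=4)
    approx = pca_reconstruct(coeffs, basis, mean)
    print("kept planes:", coeffs.shape[0], "of", fmap.shape[0])
    print("mean abs reconstruction error:", np.abs(approx - fmap).mean())
```

In this sketch, keeping k of C planes is what reduces the inter-channel data. The differencing step only pays off when neighboring values are similar, so that the deltas fit in fewer bits than the raw values.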
Pages: 3625-3638
Number of pages: 14
Related papers
25 in total
  • [1] Memory-Efficient CNN Accelerator Based on Interlayer Feature Map Compression
    Shao, Zhuang
    Chen, Xiaoliang
    Du, Li
    Chen, Lei
    Du, Yuan
    Zhuang, Wei
    Wei, Huadong
    Xie, Chenjia
    Wang, Zhongfeng
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2022, 69 (02) : 668 - 681
  • [2] Transform-Based Feature Map Compression for CNN Inference
    Shi, Yubo
    Wang, Meiqi
    Chen, Siyi
    Wei, Jinghe
    Wang, Zhongfeng
    [J]. 2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2021,
  • [3] CNN Inference Accelerators with Adjustable Feature Map Compression Ratios
    Tsai, Yu-Chih
    Liu, Chung-Yueh
    Wang, Chia-Chun
    Hsu, Tsen-Wei
    Liu, Ren-Shuo
    [J]. 2023 IEEE 41ST INTERNATIONAL CONFERENCE ON COMPUTER DESIGN, ICCD, 2023, : 631 - 634
  • [4] Intra- and inter-channel noise evolution along nonlinear, lossy and dispersive optical fibers
    Fan, CC
    [J]. 2002 IEEE/LEOS ANNUAL MEETING CONFERENCE PROCEEDINGS, VOLS 1 AND 2, 2002, : 139 - 140
  • [5] Chroma Intra Prediction Based on Inter-Channel Correlation for HEVC
    Zhang, Xingyu
    Gisquet, Christophe
    Francois, Edouard
    Zou, Feng
    Au, Oscar C.
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2014, 23 (01) : 274 - 286
  • [6] Feature Map Transform Coding for Energy-Efficient CNN Inference
    Chmiel, Brian
    Baskin, Chaim
    Zheltonozhskii, Evgenii
    Banner, Ron
    Yermolin, Yevgeny
    Karbachevsky, Alex
    Bronstein, Alex M.
    Mendelson, Avi
    [J]. 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [7] Proposal of compensation for intra- and inter-channel nonlinear distortions using optical compensation and nonlinear equalisation
    Kawahara, H.
    Yamamoto, S.
    Fukutoku, M.
    [J]. ELECTRONICS LETTERS, 2016, 52 (17) : 1471 - U101
  • [8] A Sparse CNN Accelerator for Eliminating Redundant Computations in Intra- and Inter-Convolutional/Pooling Layers
    Yang, Chen
    Meng, Yishuo
    Huo, Kaibo
    Xi, Jiawei
    Mei, Kuizhi
    [J]. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2022, 30 (12) : 1902 - 1915
  • [9] Intra- Versus Inter-Channel PMD in Linearly Compensated Coherent PDM-PSK Nonlinear Transmissions
    Serena, P.
    Rossi, N.
    Bertran-Pardo, O.
    Renaudier, J.
    Vannucci, A.
    Bononi, A.
    [J]. JOURNAL OF LIGHTWAVE TECHNOLOGY, 2011, 29 (11) : 1691 - 1700
  • [10] Attention-based Feature Compression for CNN Inference Offloading in Edge Computing
    Li, Nan
    Iosifidis, Alexandros
    Zhang, Qi
    [J]. ICC 2023-IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, 2023, : 967 - 972