CNN Inference Accelerators with Adjustable Feature Map Compression Ratios

Authors
Tsai, Yu-Chih [1]
Liu, Chung-Yueh [1]
Wang, Chia-Chun [1]
Hsu, Tsen-Wei [1]
Liu, Ren-Shuo [1]
Affiliations
[1] Natl Tsing Hua Univ, Dept Elect Engn, Hsinchu, Taiwan
Keywords
CNN; Inference-Time Adjustability; Feature Map Compression; Memory Bandwidth; Hardware Accelerator
DOI
10.1109/ICCD58817.2023.00099
Chinese Library Classification (CLC) number
TP3 [Computing Technology, Computer Technology]
Discipline classification code
0812
Abstract
Recently, there has been increasing interest in developing convolutional neural networks (CNNs) with adjustable configurations that can adapt instantly to different resource constraints during inference. At run time, the trained CNN can switch between modes to reach a chosen accuracy-energy trade-off point, similar to DVFS (dynamic voltage and frequency scaling) and turbo boost, which are widely adopted in CPUs. In this paper, we propose strategies that give CNN inference accelerators an adjustable feature map compression ratio, making their external memory access volume tunable. We use the mature JPEG technique to compress the intermediate feature maps. The critical challenge is to support these adjustable compression ratios with a single CNN instead of multiple CNNs, one per ratio. In response, we propose compression-aware joint training and switchable batch normalization (BN). We demonstrate our design with ResNet18, ResNet50, and MobileNetV2 on ImageNet, achieving inference-time compression ratio adjustability and reducing external memory access bandwidth requirements. The results show that our proposed strategies maintain Top-1 accuracy and reduce external memory access by up to 22.7x to 28.3x using only a single CNN model with sets of BN parameters corresponding to multiple compression ratios.
Pages: 631-634
Page count: 4
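
Illustrative sketch (not from the paper): the abstract describes a single CNN that carries one set of BN parameters per compression ratio. The snippet below shows one way such a switchable batch normalization layer might look at inference time. The class name, the mode labels "low" and "high", and the per-channel vector interface are all hypothetical; the abstract does not specify the actual layer layout or how modes are selected.

```python
import math


class SwitchableBatchNorm:
    """Inference-time batch norm with one parameter set per compression mode.

    Hypothetical illustration only: names and structure are assumptions,
    not the authors' implementation.
    """

    def __init__(self, num_channels, modes, eps=1e-5):
        self.eps = eps
        # One independent (gamma, beta, mean, var) set per mode. The
        # convolution weights around this layer would be shared by all modes.
        self.params = {
            mode: {
                "gamma": [1.0] * num_channels,
                "beta": [0.0] * num_channels,
                "mean": [0.0] * num_channels,
                "var": [1.0] * num_channels,
            }
            for mode in modes
        }
        self.active = modes[0]

    def switch(self, mode):
        """Select the BN parameter set that matches the current compression mode."""
        if mode not in self.params:
            raise KeyError(f"unknown compression mode: {mode!r}")
        self.active = mode

    def __call__(self, x):
        """Normalize one per-channel vector x (length num_channels)."""
        p = self.params[self.active]
        return [
            g * (v - m) / math.sqrt(s + self.eps) + b
            for v, g, b, m, s in zip(
                x, p["gamma"], p["beta"], p["mean"], p["var"]
            )
        ]
```

Under these assumptions, changing the compression ratio at run time amounts to calling `switch(...)` on every such layer, while the rest of the model stays untouched.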