Tetris: Re-architecting Convolutional Neural Network Computation for Machine Learning Accelerators

Cited by: 29
Authors
Lu, Hang [1 ,2 ]
Wei, Xin [2 ]
Lin, Ning [2 ]
Yan, Guihai [1 ,2 ]
Li, Xiao-Wei [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
DOI
10.1145/3240765.3240855
CLC Number
TP301 [Theory, Methods];
Discipline Code
081202;
Abstract
Inference efficiency is the predominant consideration in designing deep learning accelerators. Previous work mainly focuses on skipping zero values to eliminate ineffectual computation, while zero bits in non-zero values, another major source of ineffectual computation, are often ignored. The reason lies in the difficulty of extracting the essential bits during multiply-and-accumulate (MAC) operations in the processing element. Based on the observation that zero bits account for as much as 68.9% of the bits in the weights of modern deep convolutional neural network models, this paper first proposes a weight kneading technique that simultaneously eliminates the ineffectual computation caused by both zero-value weights and zero bits in non-zero weights. In addition, a split-and-accumulate (SAC) computing pattern that replaces the conventional MAC, together with the corresponding hardware accelerator design called Tetris, is proposed to support weight kneading at the hardware level. Experimental results show that Tetris speeds up inference by up to 1.50x and improves power efficiency by up to 5.33x compared with state-of-the-art baselines.
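The intuition behind split-and-accumulate can be illustrated in a few lines. The sketch below is not the paper's implementation (which kneads weights in hardware); it merely shows, for non-negative integer weights, why zero bits are ineffectual work: a SAC-style kernel adds shifted copies of the activation only at the weight's essential (set) bit positions, so zero weights and zero bits contribute no operations, yet the result matches a conventional MAC.

```python
def essential_bits(w: int) -> list[int]:
    """Positions of the set bits in a non-negative integer weight."""
    return [i for i in range(w.bit_length()) if (w >> i) & 1]

def mac(activations, weights):
    """Conventional multiply-and-accumulate: one multiply per weight."""
    return sum(a * w for a, w in zip(activations, weights))

def sac(activations, weights):
    """Split-and-accumulate: shift-adds over essential bits only."""
    acc = 0
    for a, w in zip(activations, weights):
        for bit in essential_bits(w):   # a zero weight yields no bits at all
            acc += a << bit             # a * 2^bit realized as a shift
    return acc

acts = [3, 0, 7, 1]
wts  = [5, 9, 0, 6]   # 5 = 0b101, 9 = 0b1001, 6 = 0b110
assert sac(acts, wts) == mac(acts, wts)  # both give 3*5 + 0*9 + 7*0 + 1*6 = 21
```

Here the MAC performs four multiplies, while SAC performs six shift-adds and skips the zero weight entirely; the accelerator's benefit comes from the essential-bit count being far below the full bit width on average.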
Pages: 8
Related Papers
50 results
  • [31] Forward Learning Convolutional Neural Network
    Hu, Hong
    Hong, Xin
    Hou, Dan Yang
    Shi, Zhongzhi
    INTELLIGENT INFORMATION PROCESSING IX, 2018, 538 : 51 - 61
  • [32] Learning Pooling for Convolutional Neural Network
    Sun, Manli
    Song, Zhanjie
    Jiang, Xiaoheng
    Pan, Jing
    Pang, Yanwei
    NEUROCOMPUTING, 2017, 224 : 96 - 104
  • [33] Extended Bit-Plane Compression for Convolutional Neural Network Accelerators
    Cavigelli, Lukas
    Benini, Luca
    2019 IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE CIRCUITS AND SYSTEMS (AICAS 2019), 2019, : 279 - 283
  • [34] Parallel Convolutional Neural Network (CNN) Accelerators Based on Stochastic Computing
    Zhang, Yawen
    Zhang, Xinyue
    Song, Jiahao
    Wang, Yuan
    Huang, Ru
    Wang, Runsheng
    PROCEEDINGS OF THE 2019 IEEE INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS 2019), 2019, : 19 - 24
  • [35] CNNWire: Boosting Convolutional Neural Network with Winograd on ReRAM based Accelerators
    Lin, Jilan
    Li, Shuangchen
    Hu, Xing
    Deng, Lei
    Xie, Yuan
    GLSVLSI '19 - PROCEEDINGS OF THE 2019 ON GREAT LAKES SYMPOSIUM ON VLSI, 2019, : 283 - 286
  • [36] Spatial Data Dependence Graph Simulator for Convolutional Neural Network Accelerators
    Wang, Jooho
    Kim, Jiwon
    Moon, Sungmin
    Kim, Sunwoo
    Park, Sungkyung
    Park, Chester Sungchung
    2019 IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE CIRCUITS AND SYSTEMS (AICAS 2019), 2019, : 309 - 310
  • [37] Hardware Accelerators for a Convolutional Neural Network in Condition Monitoring of CNC Machines
    Hoyer, Ingo
    Berg, Oscar
    Krupp, Lukas
    Utz, Alexander
    Wiede, Christian
    Seidl, Karsten
    2023 IEEE SENSORS, 2023,
  • [38] A Feature Map Lossless Compression Framework for Convolutional Neural Network Accelerators
    Zhang, Zekun
    Jiao, Xin
    Xu, Chengyu
    2024 IEEE 6TH INTERNATIONAL CONFERENCE ON AI CIRCUITS AND SYSTEMS, AICAS 2024, 2024, : 422 - 426
  • [39] An efficient loop tiling framework for convolutional neural network inference accelerators
    Huang, Hongmin
    Hu, Xianghong
    Li, Xueming
    Xiong, Xiaoming
    IET CIRCUITS DEVICES & SYSTEMS, 2022, 16 (01) : 116 - 123
  • [40] Exploiting Variable Precision Computation Array for Scalable Neural Network Accelerators
    Yang, Shaofei
    Liu, Longjun
    Li, Baoting
    Sun, Hongbin
    Zheng, Nanning
    2020 2ND IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE CIRCUITS AND SYSTEMS (AICAS 2020), 2020, : 315 - 319