A 3.77TOPS/W Convolutional Neural Network Processor With Priority-Driven Kernel Optimization

Cited by: 18
Authors
Yue, Jinshan [1]
Liu, Yongpan [1]
Yuan, Zhe [1]
Wang, Zhibo [1]
Guo, Qingwei [1]
Li, Jinyang [1]
Yang, Chengmo [2]
Yang, Huazhong [1]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China
[2] Univ Delaware, Dept Elect & Comp Engn, Newark, DE 19716 USA
Keywords
Convolutional neural network; CNN processor; module-parallel IS; priority-driven kernel optimization; ACCELERATOR;
DOI
10.1109/TCSII.2018.2846698
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Subject classification code
0808; 0809
Abstract
Convolutional neural networks (CNNs) have become very popular in image classification tasks. With the increasing demand for intelligent classification on battery-powered devices, energy-efficient ASICs for CNNs are badly needed. Previous CNN ASIC processors support operations of different kernel sizes, but they sacrifice efficiency to support such flexible convolution operations. In fact, convolution operations with one particular kernel size dominate in many real-world CNNs. This brief proposes a kernel-optimized architecture for 3 × 3 kernels (KOP3), which are the dominant operations in mainstream image classification CNNs. Although KOP3 targets 3 × 3 kernel operations, it also provides programmability to support arbitrary kernel sizes. KOP3 achieves an average energy efficiency of 3.77 TOPS/W, which is 4.01× better than the best state-of-the-art CNN ASIC processor.
Pages: 277-281
Number of pages: 5