A 40nm 24.6TOPS/W Scalable EfficientDet Processor for Object Detection

Cited by: 0
Authors
Chuang, Yu-Chuan [1]
Lin, Ming-Guang [1]
Huang, Chi-Tse [1]
Teng, Chieh-Feng [1]
Chang, Cheng-Yang [1]
Chen, Yi-Ta [1]
Wu, An-Yeu [1]
Affiliation
[1] Natl Taiwan Univ, Grad Inst Elect Engn, Taipei, Taiwan
Source
2024 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS 2024 | 2024
Keywords
Object detection; EfficientDet; processor; scalable;
DOI
10.1109/ISCAS58744.2024.10558521
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Object detection is a crucial technology used to identify and locate objects in a wide range of applications. Google's EfficientDet, a scalable solution, employs a compound scaling method to systematically adjust the network's depth, width, and input resolution, meeting different resource constraints on edge devices. This paper presents the first dedicated processor for EfficientDet, featuring three key elements: 1) an adaptive channel/input-wise (CIW) mapper to improve hardware utilization by applying distinct mapping strategies for layers with varying data shapes, 2) a tri-mode activation compression (AC) engine to reduce external memory access (EMA) by leveraging the sparsity level of activations, and 3) a unified aggregation core (AggrCore) to flexibly handle different computations. The chip is fabricated using TSMC 40nm CMOS technology and achieves a maximum energy efficiency of 24.6TOPS/W. Compared to the state-of-the-art object detection processor, our chip demonstrates 3.7x and 2.1x improvements in energy and area efficiencies, respectively.
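The compound scaling mentioned in the abstract can be illustrated with a minimal sketch. The formulas below are the scaling heuristics as given in the original EfficientDet paper, not in this record, and the function name `efficientdet_scaling` and its dictionary keys are hypothetical. Published EfficientDet configurations round the BiFPN width to hardware-friendly values and may deviate from these formulas for the largest models.

```python
def efficientdet_scaling(phi: int) -> dict:
    """Return network dimensions for compound coefficient phi.

    Assumption: scaling rules follow the EfficientDet paper's heuristics;
    they are not stated in this abstract.
    """
    return {
        # BiFPN channel count grows exponentially (before rounding)
        "bifpn_width": 64 * 1.35 ** phi,
        # BiFPN layer count grows linearly
        "bifpn_depth": 3 + phi,
        # box/class prediction head depth grows every third step
        "head_depth": 3 + phi // 3,
        # input resolution grows in steps of 128 pixels
        "input_resolution": 512 + 128 * phi,
    }


if __name__ == "__main__":
    for phi in range(4):
        print(phi, efficientdet_scaling(phi))
```

A single coefficient therefore changes depth, width, and input resolution together, which is why layers across model sizes have the varying data shapes that the adaptive CIW mapper is described as handling.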
Pages: 5
Related papers
50 in total
  • [1] A Scalable HCI Reliability model for 40nm nMOSFET
    Zhang, Mengdi
    Li, Xi
    Wang, Mingjuan
    Shi, Yanling
    Ren, Zheng
    Hu, Shaojian
    ELECTRONIC INFORMATION AND ELECTRICAL ENGINEERING, 2012, 19 : 67 - 69
  • [2] A 1.15-TOPS 6.57-TOPS/W DNN Processor for Multi-Scale Object Detection
    Kawamoto, Reiya
    Taichi, Masakazu
    Kabuto, Masaya
    Watanabe, Daisuke
    Izumi, Shintaro
    Yoshimoto, Masahiko
    Kawaguchi, Hiroshi
    2020 2ND IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE CIRCUITS AND SYSTEMS (AICAS 2020), 2020, : 203 - 207
  • [3] A 40nm 5.6TOPS/W 239GOPS/mm2 Self-Attention Processor with Sign Random Projection-based Approximation
    Seo, Seong Hoon
    Kim, Soosung
    Jung, Sung Jun
    Kwon, Sangwoo
    Lee, Hyunseung
    Lee, Jae W.
    ESSCIRC 2022- IEEE 48TH EUROPEAN SOLID STATE CIRCUITS CONFERENCE (ESSCIRC), 2022, : 85 - 88
  • [4] A 1.15-TOPS 6.57-TOPS/W Neural Network Processor for Multi-Scale Object Detection With Reduced Convolutional Operations
    Kawamoto, Reiya
    Taichi, Masakazu
    Kabuto, Masaya
    Watanabe, Daisuke
    Izumi, Shintaro
    Yoshimoto, Masahiko
    Kawaguchi, Hiroshi
    Matsukawa, Go
    Goto, Toshio
    Kojima, Motoshi
    IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2020, 14 (04) : 634 - 645
  • [5] A 112 μW F-band Standing Wave Detector in 40nm CMOS for Sensing and Impedance Detection
    Philippe, Bart
    Reynaert, Patrick
    2018 13TH EUROPEAN MICROWAVE INTEGRATED CIRCUITS CONFERENCE (EUMIC), 2018, : 21 - 24
  • [6] A 40nm 1Mb 35.6 TOPS/W MLC NOR-Flash based Computation-in-Memory Structure for Machine Learning
    Zhang, Yuxin
    Zeng, Sitao
    Zhu, Zhiguo
    Qin, Zhaolong
    Wang, Chen
    Li, Jingjing
    Zhang, Sanfeng
    He, Yajuan
    Dou, Chunmeng
    Si, Xin
    Chang, Meng-Fan
    Li, Qiang
    2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2021,
  • [7] A 40nm 2TOPS/W Depth-Completion Neural Network Accelerator SoC With Efficient Depth Engine for Realtime LiDAR Systems
    Sun, Miao
    Cao, Yingjie
    Qian, Jian
    Li, Jie
    Zhou, Sifan
    Zhao, Ziyu
    Wu, Yifan
    Xia, Tao
    Qin, Yajie
    Qiu, Lei
    Ma, Shunli
    Chiang, Patrick Yin
    Zhuo, Shenglong
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2023, 70 (05) : 1704 - 1708
  • [8] A 2.5GHz 7.7TOPS/W Switched-Capacitor Matrix Multiplier with Co-designed Local Memory in 40nm
    Lee, Edward H.
    Wong, S. Simon
    2016 IEEE INTERNATIONAL SOLID-STATE CIRCUITS CONFERENCE (ISSCC), 2016, 59 : 418 - U587
  • [9] A 24.1 TOPS/W mixed-signal BNN processor in 28-nm CMOS
    Kim, Hanseul
    Park, Jongmin
    Lee, Hyunbae
    Yang, Hyeokjoon
    Burm, Jinwook
    INTERNATIONAL JOURNAL OF ELECTRONICS, 2024, 111 (08) : 1288 - 1300
  • [10] A Compact 446 Gbps/W AES accelerator for Mobile SoC and IoT in 40nm
    Zhang, Yiqun
    Yang, Kaiyuan
    Saligane, Mehdi
    Blaauw, David
    Sylvester, Dennis
    2016 IEEE SYMPOSIUM ON VLSI CIRCUITS (VLSI-CIRCUITS), 2016,