Heterogeneous Edge CNN Hardware Accelerator

Authors
Moudgill, Mayan [1]
Glossner, John [1, 2]
Huang, Wei [3]
Tian, Chaoyang [3]
Xu, Chunxia [3]
Yang, Nianliang [3]
Wang, Lei [2, 4]
Liang, Tailin [2, 4]
Shi, Shaobo [2, 4]
Zhang, Xiaodong [3]
Iancu, Daniel [1]
Nacer, Gary [1]
Li, Kerry [4]
Affiliations
[1] Gen Processor Technol, Tarrytown, NY 10591 USA
[2] Univ Sci & Technol Beijing, Beijing, Peoples R China
[3] Hua Xia Gen Processor Technol, Shanghai, Peoples R China
[4] Hua Xia Gen Processor Technol, Beijing, Peoples R China
Keywords
AI Accelerator; Hardware CNN; Deep Neural Network; Heterogeneous Processor; Blind Modulation Detection; Classification
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We describe a programmable and scalable Convolutional Neural Network (CNN) hardware accelerator optimized for mobile and edge inference computing. The accelerator comprises four heterogeneous engines: an input engine, a filter engine, a post-processing engine, and an output engine. The specialized engines execute independently and concurrently. All engines share a core set of common instructions, and each engine is further specialized for specific functions. We describe the operation of each engine and provide silicon-validated results for a number of CNNs, including LeNet-5, TinySSD, and SqueezeNet. We also describe a blind modulation detection application using SqueezeNet. The accelerator has been fabricated in 28 nm CMOS and runs at 1 GHz. The logic occupies 0.6 mm² and the fully hardened core, with 2 MB of SRAM including built-in self-test, occupies 9.36 mm². The accelerator's filter engine implements 288 f16 multipliers, thereby achieving 288 GFLOPS at 1 GHz. Two TOPS of peak performance is achieved with all engines running in parallel. The accelerator, including SRAM, dissipates 193 mW running LeNet-5 at room temperature.
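The filter-engine figure in the abstract can be checked with simple arithmetic. The sketch below is illustrative only: the function name and the one-operation-per-multiplier-per-cycle convention are assumptions chosen to match the stated 288 GFLOPS, not details taken from the paper. It also computes the share of the hardened core area taken by logic, using the two area figures quoted above.

```python
def peak_gflops(multipliers: int, clock_hz: float, ops_per_cycle: int = 1) -> float:
    """Peak throughput in GFLOPS for a bank of multipliers.

    ops_per_cycle is an assumed counting convention (hypothetical parameter):
    1 counts each multiply as one operation, which matches the abstract's figure.
    """
    return multipliers * ops_per_cycle * clock_hz / 1e9


# Figures quoted in the abstract.
FILTER_MULTIPLIERS = 288   # f16 multipliers in the filter engine
CLOCK_HZ = 1e9             # 1 GHz
LOGIC_AREA_MM2 = 0.6       # logic only
CORE_AREA_MM2 = 9.36       # hardened core with 2 MB SRAM and self-test

filter_engine_peak = peak_gflops(FILTER_MULTIPLIERS, CLOCK_HZ)
logic_fraction = LOGIC_AREA_MM2 / CORE_AREA_MM2

print(f"Filter engine peak: {filter_engine_peak:.0f} GFLOPS")
print(f"Logic share of core area: {logic_fraction:.1%}")
```

Under these assumptions the logic accounts for only a small part of the core area, with the remainder attributable to the SRAM and related hardening. The abstract does not give a per-engine breakdown of the 2 TOPS peak, so that figure is not reconstructed here.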
Pages: 636-641
Number of pages: 6