Architecture and Application Co-Design for Beyond-FPGA Reconfigurable Acceleration Devices

Cited by: 5
Authors
Boutros, Andrew [1, 2]
Nurvitadhi, Eriko [2]
Betz, Vaughn [1]
Affiliations
[1] Univ Toronto, Dept Elect & Comp Engn, Toronto, ON M5S 3G4, Canada
[2] Intel Corp, Programmable Solut Grp, Santa Clara, CA 95054 USA
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Deep learning; field-programmable gate arrays; hardware acceleration; network-on-chip; reconfigurable computing; EMBEDDED NETWORKS; CHIP;
DOI
10.1109/ACCESS.2022.3204664
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In recent years, field-programmable gate arrays (FPGAs) have been increasingly deployed in datacenters as programmable accelerators that can offer software-like flexibility and custom-hardware-like efficiency for key datacenter workloads. To improve the efficiency of FPGAs for these new datacenter use cases and data-intensive applications, a new class of reconfigurable acceleration devices (RADs) is emerging. In these devices, the FPGA fine-grained reconfigurable fabric is a component of a bigger monolithic or multi-die system-in-package that can incorporate general-purpose software-programmable cores, domain-specialized accelerator blocks, and high-performance networks-on-chip (NoCs) for efficient communication between these system components. The integration of all these components in a RAD results in a huge design space and requires re-thinking the implementation of applications that need to be migrated from conventional FPGAs to these novel devices. In this work, we introduce RAD-Sim, an architecture simulator that allows rapid design space exploration for RADs and facilitates the study of complex interactions between their various components. We also present a case study that highlights the utility of RAD-Sim in re-designing applications for these novel RADs by mapping a state-of-the-art deep learning (DL) inference FPGA overlay to different RAD instances. Our case study illustrates how RAD-Sim can capture a wide variety of reconfigurable architectures, from conventional FPGAs to devices augmented with hard NoCs, specialized matrix-vector blocks, and 3D-stacked multi-die devices. In addition, we show that our tool can help architects evaluate the effect of specific RAD architecture parameters on end-to-end workload performance. Through RAD-Sim, we also show that novel RADs can potentially achieve 2.6x better performance on average compared to conventional FPGAs in the key DL application domain.
Pages: 95067-95082
Number of pages: 16