Scalable Light-Weight Integration of FPGA Based Accelerators with Chip Multi-Processors

Cited by: 5
Authors
Lin, Zhe [1]
Sinha, Sharad [2]
Liang, Hao [1]
Feng, Liang [1]
Zhang, Wei [1]
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Elect & Comp Engn, Hong Kong, Hong Kong, Peoples R China
[2] Nanyang Technol Univ, Sch Comp Engn, Singapore 639798, Singapore
Keywords
FPGA; hardware accelerator; heterogeneous system; network-on-chip; chip-multiprocessor
DOI
10.1109/TMSCS.2017.2754378
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Modern multicore systems are migrating from homogeneous to heterogeneous designs with accelerator-based computing in order to overcome the performance and power walls. Within this trend, FPGA-based accelerators are becoming increasingly attractive due to their flexibility and low design cost. In this paper, we propose architectural support for efficient interfacing between FPGA-based multi-accelerators and chip multiprocessors (CMPs) connected through a network-on-chip (NoC). Distributed packet receivers and hierarchical packet senders are designed to maintain scalability and reduce the critical path delay under a heavy task load. A dedicated accelerator chaining mechanism is also proposed to facilitate intra-FPGA data reuse among accelerators, circumventing the prohibitive communication overhead between the FPGA and the processors. To evaluate the proposed architecture, a complete system emulation with programmability support is performed using FPGA prototyping. Experimental results demonstrate that the proposed architecture achieves high performance while remaining lightweight and scalable.
Pages: 152-162
Number of pages: 11
Related papers
50 in total
  • [1] A scalable, low cost design-for-test architecture for UltraSPARC™ chip multi-processors
    Parulkar, I
    Ziaja, T
    Pendurkar, R
    D'Souza, A
    Majumdar, A
    INTERNATIONAL TEST CONFERENCE 2002, PROCEEDINGS, 2002, : 726 - 735
  • [2] An Overview of Chip Multi-Processors Simulators Technology
    Al-Manasia, Malik
    Chaczko, Zenon
    PROGRESS IN SYSTEMS ENGINEERING, 2015, 366 : 877 - 884
  • [3] A framework for providing quality of service in chip multi-processors
    Guo, Fei
    Solihin, Yan
    Zhao, Li
    Iyer, Ravishankar
    MICRO-40: PROCEEDINGS OF THE 40TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, 2007, : 343 - +
  • [4] Scalable communication architectures for massively parallel hardware multi-processors
    Jan, Yahya
    Jozwiak, Lech
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2012, 72 (11) : 1450 - 1463
  • [5] QoS-supported On-chip Communication for Multi-processors
    Mohammad Abdullah Al Faruque
    Jörg Henkel
    International Journal of Parallel Programming, 2008, 36 : 114 - 139
  • [6] SkipCache: application aware cache management for chip multi-processors
    Warrier, Tripti S.
    Kanakagiri, Raghavendra
    Mutyam, Madhu
    IET COMPUTERS AND DIGITAL TECHNIQUES, 2015, 9 (06) : 293 - 299
  • [7] Exploiting parallelism and structure to accelerate the simulation of chip multi-processors
    Penry, David A.
    Fay, Daniel
    Hodgdon, David
    Wells, Ryan
    Schelle, Graham
    August, David I.
    Connors, Dan
    TWELFTH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, PROCEEDINGS, 2006, : 27 - +
  • [8] Data Transfers on the Fly for Hierarchical Systems of Chip Multi-Processors
    Tudruj, Marek
    Masko, Lukasz
    PARALLEL PROCESSING AND APPLIED MATHEMATICS, PT I, 2012, 7203 : 50 - 59
  • [9] A Low-power network-on-chip architecture for tile-based chip multi-processors
    Psarras, Anastasios
    Lee, Junghee
    Mattheakis, Pavlos
    Nicopoulos, Chrysostomos
    Dimitrakopoulos, Giorgos
    Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI, 2016, 18-20-May-2016 : 335 - 340
  • [10] QoS-supported on-chip communication for multi-processors
    Al Faruque, Mohammad Abdullah
    Henkel, Joerg
    INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2008, 36 (01) : 114 - 139