A Hypervisor for Shared-Memory FPGA Platforms

Cited by: 34
Authors
Ma, Jiacheng [1]
Zuo, Gefei [1]
Loughlin, Kevin [1]
Cheng, Xiaohe [2]
Liu, Yanqiang [3]
Eneyew, Abel Mulugeta [4]
Qi, Zhengwei [3]
Kasikci, Baris [1]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
[2] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
[3] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[4] Addis Ababa Inst Technol, Addis Ababa, Ethiopia
Keywords
OPTIMUS; FPGA; Virtualization; Accelerators
DOI
10.1145/3373376.3378482
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Cloud providers widely deploy FPGAs as application-specific accelerators for customer use. These providers seek to multiplex their FPGAs among customers via virtualization, thereby reducing running costs. Unfortunately, most virtualization support is confined to FPGAs that expose a restrictive, host-centric programming model in which accelerators cannot issue direct memory accesses (DMAs). The host-centric model incurs high runtime overhead for workloads that exhibit pointer chasing. Thus, FPGAs are beginning to support a shared-memory programming model in which accelerators can issue DMAs. However, virtualization support for shared-memory FPGAs is limited. This paper presents OPTIMUS, the first hypervisor that supports scalable shared-memory FPGA virtualization. OPTIMUS offers both spatial multiplexing and temporal multiplexing to provide efficient and flexible sharing of each accelerator on an FPGA. To share the FPGA-CPU interconnect at a high clock frequency, OPTIMUS implements a multiplexer tree. To isolate each guest's address space, OPTIMUS introduces the technique of page table slicing as a hardware-software co-design. To support preemptive temporal multiplexing, OPTIMUS provides an accelerator preemption interface. We show that OPTIMUS supports eight physical accelerators on a single FPGA and improves the aggregate throughput of twelve real-world benchmarks by 1.98x-7x.
Pages: 827-844
Page count: 18