Kernel-as-a-Service: A Serverless Programming Model for Heterogeneous Hardware Accelerators

Cited by: 0
Authors
Pfandzelter, Tobias [1, 2]
Dhakal, Aditya [3]
Frachtenberg, Eitan [3]
Chalamalasetti, Sai Rahul [3]
Emmot, Darel [3]
Hogade, Ninad [4]
Enriquez, Rolando Pablo Hong [5]
Rattihalli, Gourav [3]
Bermbach, David [1, 2]
Milojicic, Dejan [3]
Affiliations
[1] TU Berlin, Berlin, Germany
[2] ECDF, Berlin, Germany
[3] Hewlett Packard Labs, Milpitas, CA, USA
[4] Hewlett Packard Labs, Fort Collins, CO, USA
[5] Hewlett Packard Labs, London, England
Keywords
Serverless; Accelerators; Heterogeneity; Data Analytics; HPC
DOI
10.1145/3590140.3629115
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
With the slowing of Moore's law and decline of Dennard scaling, computing systems increasingly rely on specialized hardware accelerators in addition to general-purpose compute units. Increased hardware heterogeneity necessitates disaggregating applications into workflows of fine-grained tasks that run on a diverse set of CPUs and accelerators. Current accelerator delivery models cannot support such applications efficiently, as (1) the overhead of managing accelerators erases performance benefits for fine-grained tasks; (2) exclusive accelerator use per task leads to underutilization; and (3) specialization increases complexity for developers. We propose adopting concepts from Function-as-a-Service (FaaS), which has solved these challenges for general-purpose CPUs in cloud computing. Kernel-as-a-Service (KaaS) is a novel serverless programming model for generic compute accelerators that aids heterogeneous workflows by combining the ease-of-use of higher-level abstractions with the performance of low-level hand-tuned code. We evaluate KaaS with a focus on the breadth of the idea and its generality to diverse architectures rather than on an in-depth implementation for a single accelerator. Using proof-of-concept prototypes, we show that this programming model provides performance, performance efficiency, and ease-of-use benefits across a diverse range of compute accelerators. Despite increased levels of abstraction, when compared to a naive accelerator implementation, KaaS reduces completion times for fine-grained tasks by up to 96.0% (GPU), 68.4% (FPGA), 98.6% (TPU), and 34.9% (QPU) in our experiments.
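The first problem named in the abstract, management overhead that erases the benefit of fine-grained tasks, can be illustrated with a toy model. The sketch below is not the authors' implementation or API: `FakeAccelerator`, `KernelService`, `naive_invoke`, and `count_initializations` are hypothetical names, and the "device" is plain Python that only counts how often it is initialized. It assumes that a naive delivery model sets up the device once per task, while a KaaS-style service keeps one warm device and dispatches registered kernels to it.

```python
from typing import Any, Callable, Dict, List, Tuple


class FakeAccelerator:
    """Hypothetical stand-in for a device with expensive setup; counts initializations."""

    init_count = 0

    def __init__(self) -> None:
        # Models costly context or runtime setup (assumption, not a real driver call).
        FakeAccelerator.init_count += 1

    def run(self, kernel: Callable[..., Any], *args: Any) -> Any:
        return kernel(*args)


def naive_invoke(kernel: Callable[..., Any], *args: Any) -> Any:
    """Naive delivery: every task initializes its own device before running."""
    device = FakeAccelerator()
    return device.run(kernel, *args)


class KernelService:
    """Hypothetical KaaS-style service: one warm device shared by all invocations."""

    def __init__(self) -> None:
        self._device = FakeAccelerator()  # initialized once, then reused
        self._kernels: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, kernel: Callable[..., Any]) -> None:
        self._kernels[name] = kernel

    def invoke(self, name: str, *args: Any) -> Any:
        return self._device.run(self._kernels[name], *args)


def vector_add(a: List[int], b: List[int]) -> List[int]:
    """Tiny example kernel: element-wise addition."""
    return [x + y for x, y in zip(a, b)]


def count_initializations(n_tasks: int) -> Tuple[int, int]:
    """Return (naive, shared) device initialization counts for n_tasks small tasks."""
    FakeAccelerator.init_count = 0
    for i in range(n_tasks):
        naive_invoke(vector_add, [i], [1])
    naive = FakeAccelerator.init_count

    FakeAccelerator.init_count = 0
    service = KernelService()
    service.register("vector_add", vector_add)
    for i in range(n_tasks):
        service.invoke("vector_add", [i], [1])
    shared = FakeAccelerator.init_count
    return naive, shared
```

In this model the naive path pays the setup cost once per task, whereas the service pays it once in total. The completion-time reductions reported in the abstract come from the authors' prototypes, not from this sketch.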
Pages: 192-206
Page count: 15