Offload Annotations: Bringing Heterogeneous Computing to Existing Libraries and Workloads

Cited by: 0

Authors:

Yuan, Gina [1]

Palkar, Shoumik [1]

Narayanan, Deepak [1]

Zaharia, Matei [1]

Affiliations:

[1] Stanford Univ, Stanford, CA 94305 USA

Source:

Proceedings of the 2020 USENIX Annual Technical Conference | 2020

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

As specialized hardware accelerators such as GPUs become increasingly popular, developers are looking for ways to target these platforms with high-level APIs. One promising approach is kernel libraries such as PyTorch or cuML, which provide interfaces that mirror CPU-only counterparts such as NumPy or Scikit-Learn. Unfortunately, these libraries are hard to develop and to adopt incrementally: they only support a subset of their CPU equivalents, only work with datasets that fit in device memory, and require developers to reason about data placement and transfers manually. To address these shortcomings, we present a new approach called offload annotations (OAs) that enables heterogeneous GPU computing in existing workloads with few or no code modifications. An annotator annotates the types and functions in a CPU library with equivalent kernel library functions and provides an offloading API to specify how the inputs and outputs of the function can be partitioned into chunks that fit in device memory and transferred between devices. A runtime then maps existing CPU functions to equivalent GPU kernels and schedules execution, data transfers and paging. In data science workloads using CPU libraries such as NumPy and Pandas, OAs enable speedups of up to 1200x and a median speedup of 6.3x by transparently offloading functions to a GPU using existing kernel libraries. In many cases, OAs match the performance of handwritten heterogeneous implementations. Finally, OAs can automatically page data in these workloads to scale to datasets larger than GPU memory, which would need to be done manually with most current GPU libraries.
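The annotation idea in the abstract can be illustrated with a small sketch. This is not the paper's API: the names `offload`, `DeviceBuffer`, `to_device`, `to_host`, `split`, and `DEVICE_CAPACITY` are hypothetical, the "device" is simulated in plain Python, and the chunking shown only makes sense for element-wise functions. It shows a CPU function being annotated with an equivalent device function, with inputs split into chunks that fit in a fixed device capacity and results copied back to the host.

```python
from functools import wraps

DEVICE_CAPACITY = 4  # hypothetical: max elements that fit in "device memory"


class DeviceBuffer:
    """Stand-in for data resident on an accelerator (simulated)."""

    def __init__(self, data):
        self.data = list(data)


def to_device(chunk):
    """Simulated host-to-device transfer."""
    return DeviceBuffer(chunk)


def to_host(buf):
    """Simulated device-to-host transfer."""
    return buf.data


def split(values, size):
    """Yield consecutive chunks of at most `size` elements."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def offload(device_fn, capacity=DEVICE_CAPACITY):
    """Annotate a CPU function with an equivalent device function.

    Assumes the function is element-wise, so chunk results can simply
    be concatenated in order.
    """
    def decorate(cpu_fn):
        @wraps(cpu_fn)
        def wrapper(values):
            out = []
            for chunk in split(values, capacity):
                # Transfer one chunk, run the device kernel, copy back.
                out.extend(to_host(device_fn(to_device(chunk))))
            return out

        wrapper.cpu_fn = cpu_fn  # keep the original as a fallback
        return wrapper
    return decorate


def device_square(buf):
    """Simulated device kernel equivalent to `square`."""
    return DeviceBuffer(x * x for x in buf.data)


@offload(device_square)
def square(values):
    """Original CPU implementation; callers keep using this name."""
    return [x * x for x in values]


result = square(list(range(10)))  # 10 elements, processed in chunks of 4
```

Callers continue to invoke `square` unchanged, which mirrors the abstract's claim of few or no code modifications; a real runtime would additionally decide when offloading pays off and would schedule transfers and paging, which this sketch omits.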


Pages: 293-306

Page count: 14
