P4GPU: Acceleration of Programmable Data Plane Using a CPU-GPU Heterogeneous Architecture

Cited by: 0

Authors:

Li, Peilong [1]

Luo, Yan [1]

Affiliations:

[1] Univ Massachusetts Lowell, Dept Elect & Comp Engn, Lowell, MA 01852 USA

Source:

2016 IEEE 17th International Conference on High Performance Switching and Routing (HPSR) | 2016

Keywords:

Programmable Data Plane; Heterogeneous Architecture; Packet Processing; P4; IP Lookup

DOI:

Not available

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronics and Communication Technology]

Discipline classification codes:

0808; 0809

Abstract:

The programmability of the network data plane has become one of the most desirable features in software-defined networks, with P4 serving as a domain-specific language for defining data plane processing. In this work, we address the challenges of mapping a P4-defined data plane to a heterogeneous programmable hardware architecture consisting of both a CPU and a GPU, the latter offering a parallel SIMD architecture well suited to processing network flows. We first design a toolset that can be used to map a P4 program onto the proposed architecture. We then optimize the GPU kernel designs for "match-action" primitives and present latency-hiding techniques to reduce the overheads of CPU/GPU communication. In addition, load balancing is investigated to maximize the utilization of CPU and GPU resources. Our toolset and optimizations allow a P4 program to deliver promising performance on the given heterogeneous architecture. Specifically, the experimental results collected on our prototype systems show that the automatically configured GPU kernels achieve scalable lookup and classification speeds, with 420 million IP lookups per second and more than 60 million classifications per second (for 4K firewall rules).


Pages: 168-175

Page count: 8
