P4GPU: Acceleration of Programmable Data Plane Using a CPU-GPU Heterogeneous Architecture

Cited by: 0
Authors
Li, Peilong [1]
Luo, Yan [1]
Affiliations
[1] Univ Massachusetts Lowell, Dept Elect & Comp Engn, Lowell, MA 01852 USA
Keywords
Programmable Data Plane; Heterogeneous Architecture; Packet Processing; P4; IP Lookup
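DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract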
The programmability of the network data plane has become one of the most desirable features within the context of software defined networks, with P4 serving as a domain-specific language for defining data plane processing. In this work, we are motivated to address the challenges of mapping a P4 defined data plane to a heterogeneous programmable hardware architecture consisting of both a CPU and a GPU, which includes a salient parallel SIMD architecture for processing network flows. We first design a toolset that can be used to map a P4 program onto the proposed architecture. We then optimize the GPU kernel designs for "match-action" primitives and present latency-hiding techniques to reduce the overheads of CPU/GPU communication. In addition, load balancing is investigated to maximize the utilization of CPU and GPU resources. Our toolset and optimizations allow a P4 program to render promising performance on the given heterogeneous architecture. Specifically, the experimental results collected on our prototype systems show that the automatically configured GPU kernels achieve scalable lookup and classification speeds with 420 million IP lookups per second, and more than 60 million classifications per second (for 4K firewall rules).
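To make the "match-action" primitive and the IP lookup workload mentioned in the abstract concrete, here is a minimal CPU-side sketch of a longest-prefix-match table applied to a batch of IPv4 addresses. It is not the authors' toolset or GPU kernel: the function names (`build_lpm_table`, `lpm_lookup`, `lookup_batch`), the action strings, and the per-prefix-length hash layout are all illustrative assumptions. On a GPU, each address in the batch would typically be handled by a separate thread, which is what makes the lookup a natural fit for SIMD hardware.

```python
import ipaddress


def build_lpm_table(routes):
    """Build a match-action table from (cidr, action) pairs.

    Hypothetical layout: one hash map per prefix length, keyed by the
    integer network address. IPv4 only.
    """
    table = {}
    for cidr, action in routes:
        net = ipaddress.ip_network(cidr)
        table.setdefault(net.prefixlen, {})[int(net.network_address)] = action
    return table


def lpm_lookup(table, addr, default="drop"):
    """Match: find the longest prefix covering addr. Action: return its action."""
    ip = int(ipaddress.ip_address(addr))
    # Try the longest prefixes first so the most specific route wins.
    for plen in sorted(table, reverse=True):
        mask = (0xFFFFFFFF << (32 - plen)) & 0xFFFFFFFF
        action = table[plen].get(ip & mask)
        if action is not None:
            return action
    return default


def lookup_batch(table, addrs):
    """Process a batch of addresses; each lookup is independent of the others,
    so a GPU implementation could assign one address per thread."""
    return [lpm_lookup(table, a) for a in addrs]
```

A call such as `lookup_batch(build_lpm_table([("10.0.0.0/8", "port1"), ("10.1.0.0/16", "port2")]), ["10.1.2.3", "10.2.3.4"])` resolves each address to the action of its most specific matching prefix, falling back to the assumed default action `"drop"` when no prefix matches.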
Pages: 168-175
Page count: 8
Related papers
50 in total
  • [1] P4GPU: Accelerate Packet Processing of a P4 Program with a CPU-GPU Heterogeneous Architecture
    Li, Peilong
    Luo, Yan
    PROCEEDINGS OF THE 2016 SYMPOSIUM ON ARCHITECTURES FOR NETWORKING AND COMMUNICATIONS SYSTEMS (ANCS'16), 2016, : 125 - 126
  • [2] Heterogeneous Cache Hierarchy Management for Integrated CPU-GPU Architecture
    Wen, Hao
    Zhang, Wei
    2019 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2019,
  • [3] gem5-gpu: A Heterogeneous CPU-GPU Simulator
    Power, Jason
    Hestness, Joel
    Orr, Marc S.
    Hill, Mark D.
    Wood, David A.
    IEEE COMPUTER ARCHITECTURE LETTERS, 2015, 14 (01) : 34 - 36
  • [4] GFlink: An In-Memory Computing Architecture on Heterogeneous CPU-GPU Clusters for Big Data
    Chen, Cen
    Li, Kenli
    Ouyang, Aijia
    Tang, Zhuo
    Li, Keqin
    PROCEEDINGS 45TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING - ICPP 2016, 2016, : 542 - 551
  • [5] GFlink: An In-Memory Computing Architecture on Heterogeneous CPU-GPU Clusters for Big Data
    Chen, Cen
    Li, Kenli
    Ouyang, Aijia
    Zeng, Zeng
    Li, Keqin
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2018, 29 (06) : 1275 - 1288
  • [6] Denial of Service in CPU-GPU Heterogeneous Architectures
    Wen, Hao
    Zhang, Wei
    2020 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2020,
  • [7] A Survey of CPU-GPU Heterogeneous Computing Techniques
    Mittal, Sparsh
    Vetter, Jeffrey S.
    ACM COMPUTING SURVEYS, 2015, 47 (04)
  • [8] Heterogeneous CPU-GPU Execution of Stencil Applications
    Siklosi, Balint
    Reguly, Istvan Z.
    Mudalige, Gihan R.
    PROCEEDINGS OF 2018 IEEE/ACM INTERNATIONAL WORKSHOP ON PERFORMANCE, PORTABILITY AND PRODUCTIVITY IN HPC (P3HPC 2018), 2018, : 71 - 80
  • [9] Parallel Graph Partitioning on a CPU-GPU Architecture
    Goodarzi, Bahareh
    Burtscher, Martin
    Goswami, Dhrubajyoti
    2016 IEEE 30TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2016, : 58 - 66
  • [10] Accelerating MapReduce on a Coupled CPU-GPU Architecture
    Chen, Linchuan
    Huo, Xin
    Agrawal, Gagan
    2012 INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SC), 2012,