High-Throughput Lossless Compression on Tightly Coupled CPU-FPGA Platforms

Cited by: 31
Authors
Qiao, Weikang [1]
Du, Jieqiong [1]
Fang, Zhenman [1,2]
Lo, Michael [1]
Chang, Mau-Chung Frank [1]
Cong, Jason [1]
Affiliations
[1] Univ Calif Los Angeles, Ctr Domain Specif Comp, Los Angeles, CA 90024 USA
[2] Xilinx, San Jose, CA USA
DOI
10.1109/FCCM.2018.00015
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Data compression techniques have been widely used to reduce data storage and movement overhead, especially in the big data era. While FPGAs are well suited to accelerating computation-intensive lossless compression algorithms, big data compression with parallel requests intrinsically poses two challenges to overall system throughput. First, scaling existing single-engine FPGA compression accelerator designs already encounters bottlenecks that result in lower clock frequency, saturated throughput, and lower area efficiency. Second, when such FPGA compression accelerators are integrated with processors, the overall system throughput is typically limited by the communication between the CPU and the FPGA. We propose a novel multi-way parallel and fully pipelined architecture to achieve high-throughput lossless compression on modern Intel-Altera HARPv2 platforms. To compensate for the compression ratio loss in a multi-way design, we implement novel techniques, such as a better data feeding method and a hash chain to increase the hash dictionary history. Our accelerator kernel itself can achieve a compression throughput of 12.8 GB/s (2.3x better than the current record throughput) and a comparable compression ratio of 2.03 for standard benchmark data. Our approach enables design scalability without a reduction in clock frequency and also improves the performance-per-area efficiency (up to 1.5x). Moreover, we exploit the high CPU-FPGA communication bandwidth of HARPv2 platforms to improve the compression throughput of the overall system, which can achieve an average practical end-to-end throughput of 10.0 GB/s (up to 12 GB/s for larger input files) on HARPv2.
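The abstract mentions a hash chain that increases the hash dictionary history. As a rough software illustration of that general idea only (not the authors' hardware design), the Python sketch below runs a greedy LZ77 parse in which each hash bucket remembers up to `chain_depth` earlier positions instead of just one. The function names, the constants `MIN_MATCH`, `MAX_MATCH`, and `WINDOW`, and the token format are all assumptions made for this example.

```python
from collections import defaultdict, deque

# Illustrative constants (assumed; chosen to resemble DEFLATE-style limits).
MIN_MATCH = 3
MAX_MATCH = 258
WINDOW = 32 * 1024


def lz77_tokens(data: bytes, chain_depth: int = 4):
    """Greedy LZ77 parse; returns literals (int) and (distance, length) tuples."""
    # Each bucket keeps the chain_depth most recent positions of a 3-byte prefix.
    table = defaultdict(lambda: deque(maxlen=chain_depth))
    tokens = []
    i, n = 0, len(data)
    while i < n:
        best_len, best_dist = 0, 0
        if i + MIN_MATCH <= n:
            key = data[i:i + MIN_MATCH]
            for cand in reversed(table[key]):  # newest candidate first
                if i - cand > WINDOW:
                    break  # older candidates are even farther away
                length = 0
                while (length < MAX_MATCH and i + length < n
                       and data[cand + length] == data[i + length]):
                    length += 1
                if length > best_len:
                    best_len, best_dist = length, i - cand
        if best_len >= MIN_MATCH:
            tokens.append((best_dist, best_len))
            step = best_len
        else:
            tokens.append(data[i])
            step = 1
        # Record every position just consumed so later data can reference it.
        for j in range(i, min(i + step, n - MIN_MATCH + 1)):
            table[data[j:j + MIN_MATCH]].append(j)
        i += step
    return tokens


def lz77_decode(tokens) -> bytes:
    """Inverse of lz77_tokens; copies byte by byte so overlapping matches work."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            dist, length = tok
            for _ in range(length):
                out.append(out[-dist])
        else:
            out.append(tok)
    return bytes(out)
```

With `chain_depth=1` each bucket holds only the most recent position, which corresponds to a plain single-entry hash dictionary; larger values let the match search look further back in history at the cost of more comparisons per position.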
Pages: 37-52
Page count: 16