Energy and Computing Assessment of Video Processing Kernels on CPU and FPGA platforms

Authors
Mangrich, Fillipi [2]
Foes, Joao Gabriel Firta [2]
Correa, Guilherme [1]
Seidel, Ismael [2]
Grellert, Mateus [3]
Affiliations
[1] Fed Univ Pelotas PPGC UFPel, Pelotas, RS, Brazil
[2] Fed Univ Santa Catarina UFSC, Embedded Comp Lab ECL, Florianopolis, SC, Brazil
[3] Fed Univ Rio Grande Do Sul UFRGS, Porto Alegre, RS, Brazil
Keywords
video coding; similarity metrics; energy comparison; CPU; FPGA
DOI
10.1109/SBCCI60457.2023.10261966
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Heterogeneous architectures are becoming increasingly common, allowing the acceleration of smaller modules that compose complex systems. This is especially beneficial when such systems mix data-flow and control-flow algorithms: the former can be hardware-optimized, while the latter can still execute on a CPU. In video encoders, intra- and inter-prediction are typical examples of data-flow operations. These steps involve block-matching searches that aim to find the most similar pair of blocks, one being encoded and one generated during prediction. Similarity can be measured in different ways, but the most common metrics are the Sum of Absolute Differences (SAD), the Sum of Absolute Transformed Differences (SATD), and the Sum of Squared Differences (SSD). All of these distortion metrics are executed several times for each block being encoded, so reducing the time or energy required to compute them is highly beneficial. This paper presents a comparison of the energy costs of the SAD and SSD operations on a CPU and on dedicated VLSI designs. The experiments were conducted on an Artix-7-based FPGA device. The VLSI architectures and simulation routines were designed in VHDL, and the software versions were written in C. To optimize throughput and resource utilization, the dedicated units were designed using pipelining and resource sharing where possible. Our results show that, as expected, the FPGA is considerably more energy-efficient than the CPU, with power-efficiency gains in the range of 100 times.
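To make the two metrics evaluated in the abstract concrete, the following is a minimal C sketch of SAD and SSD over a rectangular block. It is not the authors' implementation: the function names `block_sad` and `block_ssd`, the 8-bit sample type, the row-major layout, and the single `stride` shared by both blocks are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative sketch only. Names, the 8-bit sample type, and the shared
 * row stride are assumptions, not details taken from the paper. */

/* Sum of Absolute Differences between two width x height blocks. */
uint32_t block_sad(const uint8_t *cur, const uint8_t *ref,
                   size_t width, size_t height, size_t stride)
{
    assert(cur != NULL && ref != NULL && stride >= width);
    uint32_t acc = 0;
    for (size_t y = 0; y < height; ++y) {
        for (size_t x = 0; x < width; ++x) {
            /* Widen to int so the subtraction cannot wrap around. */
            int diff = (int)cur[y * stride + x] - (int)ref[y * stride + x];
            acc += (uint32_t)(diff < 0 ? -diff : diff);
        }
    }
    return acc;
}

/* Sum of Squared Differences between two width x height blocks. */
uint64_t block_ssd(const uint8_t *cur, const uint8_t *ref,
                   size_t width, size_t height, size_t stride)
{
    assert(cur != NULL && ref != NULL && stride >= width);
    uint64_t acc = 0;
    for (size_t y = 0; y < height; ++y) {
        for (size_t x = 0; x < width; ++x) {
            int diff = (int)cur[y * stride + x] - (int)ref[y * stride + x];
            /* |diff| <= 255, so diff * diff fits comfortably in an int. */
            acc += (uint64_t)(diff * diff);
        }
    }
    return acc;
}
```

Both loops perform one subtraction per sample and differ only in the per-sample operation (absolute value versus squaring) before accumulation. That regular, branch-light structure is what makes such kernels natural candidates for the pipelined dedicated hardware the abstract describes.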
Pages: 89-94
Number of pages: 6