An Efficient Parallel Deblocking Filter Based on GPU: Implementation and Optimization

Cited by: 0
Authors
Su, Huayou [1]
Zhang, Chunyuan [1]
Chai, Jun [1]
Yang, Qianming [1]
Affiliation
[1] National University of Defense Technology, School of Computer, Changsha, Hunan, People's Republic of China
Keywords
Deblocking filter; parallel processing; GPU; H.264/AVC
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The deblocking filter is one of the most time-consuming tasks in the H.264/AVC standard. Because of its data dependencies and frequent memory accesses, mapping the algorithm efficiently onto massively parallel architectures is a difficult challenge. In this paper, a novel GPU-based parallel deblocking filter is proposed, which weakens the dependencies between macroblocks (MBs) by rearranging the order in which boundaries are filtered. We implemented the proposed algorithm on a GPU and optimized the program through three strategies: kernel combination, reuse of intermediate data, and optimized data representation. Experimental results show that the proposed parallel method supports real-time processing of 1080p video at 450 fps. We also observed speedups of 3.78x and 16.68x for the comprehensively optimized parallel deblocking filter compared with a two-core processor implementation and the state-of-the-art GPU-based implementation, respectively.
Pages: 280-285
Number of pages: 6