A Efficient Parallel Deblocking Filter Based on GPU: Implementation and Optimization

Cited by: 0
Authors
Su, Huayou [1]
Zhang, Chunyuan [1]
Chai, Jun [1]
Yang, Qianming [1]
Affiliations
[1] Natl Univ Def Technol, Sch Comp, Changsha, Hunan, Peoples R China
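Keywords
Deblocking filter; parallel processing; GPU; H.264/AVC
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract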
The deblocking filter is one of the most time-consuming tasks in the H.264/AVC standard. Because of its data dependencies and frequent memory accesses, mapping the algorithm efficiently onto massively parallel architectures is an arduous challenge. In this paper, a novel GPU-based parallel deblocking filter is proposed, which weakens the dependencies between macroblocks (MBs) by rearranging the filtering order of boundaries. We implemented the proposed algorithm on a GPU and optimized the program through three strategies: kernel combination, reuse of intermediate data, and optimized data representation. Experimental results show that the proposed parallel method sustains real-time processing of 1080p video at 450 fps. We also observed speedups of 3.78x and 16.68x for the fully optimized parallel deblocking filter compared with a two-core processor implementation and a state-of-the-art GPU-based implementation, respectively.
Pages: 280-285
Number of pages: 6
Related papers
50 in total
  • [11] Efficient Parallel Implementation of Morphological Operation on GPU and FPGA
    Li, Teng
    Dou, Yong
    Jiang, Jingfei
    Gao, Jing
    [J]. 2014 INTERNATIONAL CONFERENCE ON SECURITY, PATTERN ANALYSIS, AND CYBERNETICS (SPAC), 2014, : 430 - 435
  • [12] Optimization of the deblocking filter in h.264 codec for real time implementation
    Yadav, Hitesh
    Rao, K. R.
    [J]. 2006 INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS AND INFORMATION TECHNOLOGIES,VOLS 1-3, 2006, : 117 - +
  • [13] Subword Parallel Conditional Execution in H-264/AVC Deblocking Filter Implementation
    Sihvo, Tero
    [J]. ICSES 2008 INTERNATIONAL CONFERENCE ON SIGNALS AND ELECTRONIC SYSTEMS, CONFERENCE PROCEEDINGS, 2008, : 411 - 414
  • [14] An Efficient Deblocking Filter Algorithm for HEVC
    Kang Runlong
    Zhou Wei
    Huang Xiaodong
    Dong BingChao
    [J]. 2014 IEEE CHINA SUMMIT & INTERNATIONAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (CHINASIP), 2014, : 379 - 383
  • [15] Accelerating IEEE 1857 Deblocking Filter on GPU Using CUDA
    Sun, Xiaoou
    Wang, Ronggang
    [J]. 2015 1ST IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM), 2015, : 415 - 419
  • [16] A GPU-Based Parallel Reduction Implementation
    Rfaei Jradi, Walid Abdala
    Dantas do Nascimento, Hugo Alexandre
    Martins, Wellington Santos
    [J]. HIGH PERFORMANCE COMPUTING SYSTEMS, WSCAD 2018, 2020, 1171 : 168 - 182
  • [17] A novel parallel deblocking filtering strategy for HEVC/H.265 based on GPU
    Jiang, Wenbin
    Mei, Hongyan
    Lu, Feng
    Jin, Hai
    Yang, Laurence T.
    Luo, Bin
    Chi, Ye
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2016, 28 (16): : 4264 - 4276
  • [18] An Efficient Graph Isomorphism Algorithm Based on Canonical Labeling and Its Parallel Implementation on GPU
    Wang, Renda
    Guo, Longjiang
    Ai, Chunyu
    Li, Jinbao
    Ren, Meirui
    Li, Keqin
    [J]. 2013 IEEE 15TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2013 IEEE INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (HPCC_EUC), 2013, : 1089 - 1096
  • [19] An efficient approach for generating pencil filter and its implementation on GPU
    Me, Dang-en
    Zhao, Yang
    Xu, Dan
    [J]. PROCEEDINGS OF 2007 10TH IEEE INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN AND COMPUTER GRAPHICS, 2007, : 185 - +
  • [20] BER Guaranteed Optimization and Implementation of Parallel Turbo Decoding on GPU
    Chen, Xiang
    Zhu, Ji
    Wen, Ziyu
    Wang, Yu
    Yang, Huazhong
    [J]. 2013 8TH INTERNATIONAL ICST CONFERENCE ON COMMUNICATIONS AND NETWORKING IN CHINA (CHINACOM), 2013, : 183 - 188