Accelerating Wavelet Lifting on Graphics Hardware Using CUDA

Cited by: 81
Authors
van der Laan, Wladimir J. [1]
Jalba, Andrei C. [2]
Roerdink, Jos B. T. M. [1]
Affiliations
[1] Univ Groningen, Johann Bernoulli Inst Math & Comp Sci, NL-9700 AK Groningen, Netherlands
[2] Eindhoven Univ Technol, Inst Math & Comp Sci, NL-5600 MB Eindhoven, Netherlands
Funding
U.S. National Science Foundation;
Keywords
Discrete wavelet transform; wavelet lifting; graphics hardware; CUDA; TRANSFORM; IMPLEMENTATION;
DOI
10.1109/TPDS.2010.143
Chinese Library Classification (CLC) number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
The Discrete Wavelet Transform (DWT) has a wide range of applications from signal processing to video and image compression. We show that this transform, by means of the lifting scheme, can be performed in a memory and computation-efficient way on modern, programmable GPUs, which can be regarded as massively parallel coprocessors through NVidia's CUDA compute paradigm. The three main hardware architectures for the 2D DWT (row-column, line-based, block-based) are shown to be unsuitable for a CUDA implementation. Our CUDA-specific design can be regarded as a hybrid method between the row-column and block-based methods. We achieve considerable speedups compared to an optimized CPU implementation and earlier non-CUDA-based GPU DWT methods, both for 2D images and 3D volume data. Additionally, memory usage can be reduced significantly compared to previous GPU DWT methods. The method is scalable and the fastest GPU implementation among the methods considered. A performance analysis shows that the results of our CUDA-specific design are in close agreement with our theoretical complexity analysis.
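For readers unfamiliar with the lifting scheme mentioned in the abstract, the following is a minimal CPU-side sketch of one level of a 1D lifting transform. It is not the authors' CUDA implementation. The abstract does not name a wavelet, so the reversible integer 5/3 (LeGall) filter is assumed here purely for illustration, and the function names `lift_53_forward` and `lift_53_inverse` are hypothetical. The sketch shows the two lifting steps, predict and update, each of which can be computed in place from neighbouring samples; that in-place property is what makes lifting memory-efficient.

```python
def lift_53_forward(x):
    """One level of the integer 5/3 lifting transform (assumed wavelet).

    x must be a list of ints with even length >= 2.
    Returns (s, d): approximation and detail coefficients.
    """
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("expected an even-length signal with at least 2 samples")
    even, odd = x[0::2], x[1::2]
    half = n // 2
    # Predict step: detail = odd sample minus the floor-average of its two
    # even neighbours; the right neighbour is mirrored at the boundary.
    d = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1)
         for i in range(half)]
    # Update step: approximation = even sample plus a rounded quarter of the
    # two neighbouring details; the left neighbour is mirrored at the boundary.
    s = [even[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2)
         for i in range(half)]
    return s, d


def lift_53_inverse(s, d):
    """Undo lift_53_forward by running the lifting steps in reverse order."""
    half = len(s)
    # Reverse the update step first, then the predict step.
    even = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2)
            for i in range(half)]
    odd = [d[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1)
           for i in range(half)]
    x = [0] * (2 * half)
    x[0::2] = even
    x[1::2] = odd
    return x


if __name__ == "__main__":
    signal = [3, 7, 1, 8, 2, 9, 4, 6]
    approx, detail = lift_53_forward(signal)
    print(approx, detail)
    print(lift_53_inverse(approx, detail) == signal)
```

Because the inverse subtracts exactly what the forward pass added, reconstruction is exact in integer arithmetic. A 2D transform in the row-column style would apply such a 1D pass along rows and then along columns.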
Pages: 132-146
Page count: 15
Related papers
50 in total
  • [1] Accelerating Wavelet-Based Video Coding on Graphics Hardware using CUDA
    van der Laan, Wladimir J.
    Roerdink, Jos B. T. M.
    Jalba, Andrei C.
    [J]. 2009 PROCEEDINGS OF 6TH INTERNATIONAL SYMPOSIUM ON IMAGE AND SIGNAL PROCESSING AND ANALYSIS (ISPA 2009), 2009, : 614 - +
  • [2] CUDA-BLASTP: Accelerating BLASTP on CUDA-Enabled Graphics Hardware
    Liu, Weiguo
    Schmidt, Bertil
    Mueller-Wittig, Wolfgang
    [J]. IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2011, 8 (06) : 1678 - 1684
  • [3] CUDA-MAFFT: Accelerating MAFFT on CUDA-Enabled Graphics Hardware
    Zhu, Xiangyuan
    Li, Kenli
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2013,
  • [4] Accelerating Adaptive Directional Lifting Based Wavelet Decomposition on GPU Using CUDA
    Chen Jiazhong
    Ju Zengwei
    Cao Hua
    Xia Tao
    Dai Yingying
    Wang Ning
    Xie Ping
    Qin Leihua
    [J]. PROCEEDINGS OF 2012 IEEE 11TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP) VOLS 1-3, 2012, : 413 - +
  • [5] Accelerating Vector Graphics Rendering using the Graphics Hardware Pipeline
    Batra, Vineet
    Kilgard, Mark J.
    Kumar, Harish
    Lorach, Tristan
    [J]. ACM TRANSACTIONS ON GRAPHICS, 2015, 34 (04):
  • [6] Design of a Parallel AES for Graphics Hardware using the CUDA framework
    Di Biagio, Andrea
    Barenghi, Alessandro
    Agosta, Giovanni
    Pelosi, Gerardo
    [J]. 2009 IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL & DISTRIBUTED PROCESSING, VOLS 1-5, 2009, : 3139 - +
  • [7] Accelerating molecular dynamics simulations using Graphics Processing Units with CUDA
    Liu, Weiguo
    Schmidt, Bertil
    Voss, Gerrit
    Mueller-Wittig, Wolfgang
    [J]. COMPUTER PHYSICS COMMUNICATIONS, 2008, 179 (09) : 634 - 641
  • [8] Optimizing a Semantic Comparator using CUDA-enabled Graphics Hardware
    Tripathy, Aalap
    Mohan, Suneil
    Mahapatra, Rabi
    [J]. FIFTH IEEE INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC 2011), 2011, : 125 - 132
  • [9] Accelerating FCM neural network classifier using graphics processing units with CUDA
    Wang, Lin
    Yang, Bo
    Chen, Yuehui
    Chen, Zhenxiang
    Sun, Hongwei
    [J]. APPLIED INTELLIGENCE, 2014, 40 (01) : 143 - 153
  • [10] Accelerating FCM neural network classifier using graphics processing units with CUDA
    Lin Wang
    Bo Yang
    Yuehui Chen
    Zhenxiang Chen
    Hongwei Sun
    [J]. Applied Intelligence, 2014, 40 : 143 - 153