Reducing memory usage by the lifting-based discrete wavelet transform with a unified buffer on a GPU

Cited by: 11
Authors
Ikuzawa, Takuya [1, 2]
Ino, Fumihiko [1 ]
Hagihara, Kenichi [1 ]
Affiliations
[1] Osaka Univ, Grad Sch Informat Sci & Technol, 1-5 Yamada Oka, Suita, Osaka 5650871, Japan
[2] Toshiba Co Ltd, Ind ICT Solut Co, Saiwai Ku, 72-34 Horikawa Cho, Kawasaki, Kanagawa 2120013, Japan
Funding
Japan Science and Technology Agency; Japan Society for the Promotion of Science
Keywords
Discrete wavelet transform; Lifting scheme; Memory-saving computation; In-place algorithm; GPU; Implementation
DOI
10.1016/j.jpdc.2016.03.010
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
In this study, to improve the speed of the lifting-based discrete wavelet transform (DWT) for large-scale data, we propose a parallel method that achieves low memory usage and highly efficient memory access on a graphics processing unit (GPU). The proposed method reduces the memory usage by unifying the input buffer and output buffer, at the cost of a working memory region that is smaller than the data size. The method partitions the input data into small chunks, which are then rearranged into groups so that different groups of chunks can be processed in parallel. This data rearrangement scheme classifies chunks in terms of data dependency, and it also facilitates the transformation via simultaneous access to contiguous memory regions, which can be handled efficiently by the GPU. In addition, this data rearrangement is interpreted as a product of circular permutations such that a sequence of seeds, which is an order of magnitude shorter than the input data, allows the GPU threads to compute the complicated memory indexes needed for parallel rearrangement. Because the DWT is usually part of a processing pipeline in an application, we believe that the proposed method is useful for retaining the amount of memory for use by other pipeline stages. © 2016 Elsevier Inc. All rights reserved.
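The two ideas named in the abstract, an in-place lifting step and a rearrangement expressed as a product of circular permutations driven by a short list of seeds, can be illustrated with a small sequential sketch. This is not the authors' GPU method: it assumes a one-level integer Haar lifting step, rearranges single samples rather than chunks, and runs on the CPU. All function names (`haar_lift_inplace`, `dest`, `cycle_seeds`, `deinterleave_inplace`, `dwt_level_inplace`) are hypothetical and chosen for illustration only.

```python
def haar_lift_inplace(x):
    """One level of integer Haar lifting on an even-length list, in place.

    Afterwards even positions hold approximation coefficients and odd
    positions hold detail coefficients (interleaved layout).
    """
    n = len(x)
    assert n % 2 == 0, "sketch assumes an even number of samples"
    # Predict step: each odd sample becomes a detail coefficient.
    for i in range(1, n, 2):
        x[i] -= x[i - 1]
    # Update step: each even sample becomes an approximation coefficient.
    for i in range(0, n, 2):
        x[i] += x[i + 1] >> 1  # floor(detail / 2)


def dest(i, n):
    """Destination of index i when [s0 d0 s1 d1 ...] becomes [s0 s1 ... d0 d1 ...]."""
    return i // 2 if i % 2 == 0 else n // 2 + i // 2


def cycle_seeds(n):
    """Return one seed (the smallest index) for every non-trivial cycle of dest.

    Assumption of this sketch: seeds are found by walking each cycle,
    which needs no extra buffer but is not how the paper derives them.
    """
    seeds = []
    for start in range(n):
        j = dest(start, n)
        if j == start:
            continue  # fixed point, nothing to move
        while j > start:
            j = dest(j, n)
        if j == start:  # no smaller index met, so start leads its cycle
            seeds.append(start)
    return seeds


def deinterleave_inplace(x, seeds):
    """Apply the permutation cycle by cycle, carrying one element at a time."""
    n = len(x)
    for s in seeds:
        carried = x[s]
        j = dest(s, n)
        while j != s:
            x[j], carried = carried, x[j]
            j = dest(j, n)
        x[s] = carried


def dwt_level_inplace(x):
    """Lifting followed by rearrangement, both within the single buffer x."""
    haar_lift_inplace(x)
    deinterleave_inplace(x, cycle_seeds(len(x)))


if __name__ == "__main__":
    data = [5, 7, 3, 1, 8, 8, 2, 6]
    dwt_level_inplace(data)
    print(data)  # approximation half first, then detail half
```

Because the cycles are disjoint, each seed could in principle be handled by an independent worker, which is the property the abstract relies on for parallel rearrangement; the sketch itself processes them one after another.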
Pages: 44-55
Number of pages: 12