Reducing memory usage by the lifting-based discrete wavelet transform with a unified buffer on a GPU

Cited by: 11
Authors
Ikuzawa, Takuya [1 ,2 ]
Ino, Fumihiko [1 ]
Hagihara, Kenichi [1 ]
Affiliations
[1] Osaka Univ, Grad Sch Informat Sci & Technol, 1-5 Yamada Oka, Suita, Osaka 5650871, Japan
[2] Toshiba Co Ltd, Ind ICT Solut Co, Saiwai Ku, 72-34 Horikawa Cho, Kawasaki, Kanagawa 2120013, Japan
Funding
Japan Science and Technology Agency; Japan Society for the Promotion of Science;
Keywords
Discrete wavelet transform; Lifting scheme; Memory-saving computation; In-place algorithm; GPU; IMPLEMENTATION;
DOI
10.1016/j.jpdc.2016.03.010
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
In this study, to improve the speed of the lifting-based discrete wavelet transform (DWT) for large-scale data, we propose a parallel method that achieves low memory usage and highly efficient memory access on a graphics processing unit (GPU). The proposed method reduces the memory usage by unifying the input buffer and output buffer but at the cost of a working memory region that is smaller than the data size. The method partitions the input data into small chunks, which are then rearranged into groups so different groups of chunks can be processed in parallel. This data rearrangement scheme classifies chunks in terms of data dependency but it also facilitates transformation via simultaneous access to contiguous memory regions, which can be handled efficiently by the GPU. In addition, this data rearrangement is interpreted as a product of circular permutations such that a sequence of seeds, which is an order of magnitude shorter than input data, allows the GPU threads to compute the complicated memory indexes needed for parallel rearrangement. Because the DWT is usually part of a processing pipeline in an application, we believe that the proposed method is useful for retaining the amount of memory for use by other pipeline stages. (C) 2016 Elsevier Inc. All rights reserved.
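The two ideas named in the abstract, a transform that overwrites its own input and a rearrangement expressed as a product of circular permutations driven by a short list of seeds, can be illustrated with a minimal sequential sketch. This is not the authors' GPU method: it assumes a one-level integer Haar lifting step (the abstract does not name the wavelet), a plain Python list as the unified buffer, and hypothetical helper names (`lift_haar_inplace`, `dest`, `cycle_seeds`, `deinterleave_inplace`) chosen only for illustration. The chunking and parallel grouping described in the abstract are not modeled here.

```python
def lift_haar_inplace(x):
    """One level of integer Haar lifting that overwrites x (len(x) must be even).

    After the call, even slots hold approximation coefficients and
    odd slots hold detail coefficients (interleaved layout).
    """
    for i in range(0, len(x), 2):
        x[i + 1] -= x[i]        # predict step: detail = odd - even
        x[i] += x[i + 1] >> 1   # update step: approx = even + floor(detail / 2)


def dest(i, n):
    """Index where interleaved slot i must end up after de-interleaving."""
    return i // 2 if i % 2 == 0 else n // 2 + i // 2


def cycle_seeds(n):
    """Return the smallest index of every non-trivial cycle of dest().

    The permutation is a product of disjoint cycles; one seed per cycle
    is enough to regenerate every index in that cycle.
    """
    seeds = []
    for start in range(n):
        j = dest(start, n)
        while j > start:        # walk the cycle until we return or drop below start
            j = dest(j, n)
        if j == start and dest(start, n) != start:
            seeds.append(start)
    return seeds


def deinterleave_inplace(x, seeds):
    """Rotate each cycle so approximations fill x[:n/2] and details x[n/2:]."""
    n = len(x)
    for s in seeds:
        carried = x[s]
        j = dest(s, n)
        while j != s:
            x[j], carried = carried, x[j]   # drop the carried value, pick up the next
            j = dest(j, n)
        x[s] = carried


buf = [5, 9, 2, 8, 7, 1, 4, 4]
lift_haar_inplace(buf)            # buf is now [7, 4, 5, 6, 4, -6, 4, 0]
seeds = cycle_seeds(len(buf))     # [1, 3]
deinterleave_inplace(buf, seeds)  # buf is now [7, 5, 4, 4, 4, 6, -6, 0]
```

In this sketch each cycle touches a disjoint set of slots, so distinct seeds could in principle be handled by independent workers with only one carried value each, which is the property the abstract relies on for parallel rearrangement.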
Pages: 44-55
Number of pages: 12
Related papers
50 entries in total
  • [31] Flipping structure: An efficient VLSI architecture for lifting-based discrete wavelet transform
    Huang, CT
    Tseng, PC
    Chen, LG
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2004, 52 (04) : 1080 - 1089
  • [32] A rescheduling and fast pipeline VLSI architecture for lifting-based discrete wavelet transform
    Wu, BF
    Lin, CF
    PROCEEDINGS OF THE 2003 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOL II: COMMUNICATIONS-MULTIMEDIA SYSTEMS & APPLICATIONS, 2003, : 732 - 735
  • [33] Efficient modified directional lifting-based discrete wavelet transform for moving object detection
    Hsia, Chih-Hsien
    Guo, Jing-Ming
    SIGNAL PROCESSING, 2014, 96 : 138 - 152
  • [34] A new VLSI architecture for lifting-based wavelet transform
    Fan, Wenbing
    Qin, Ruilin
    Cao, Xiaoguang
    2006 8TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, VOLS 1-4, 2006, : 1103 - +
  • [35] A note on "flipping structure: An efficient VLSI architecture for lifting-based discrete wavelet transform"
    Xiong, CY
    Tian, JW
    Liu, H
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2006, 54 (05) : 1910 - 1916
  • [36] Efficient VLSI architectures of lifting-based discrete wavelet transform by systematic design method
    Huang, CT
    Tseng, PC
    Chen, AG
    2002 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOL V, PROCEEDINGS, 2002, : 565 - 568
  • [37] Electrocardiogram signals de-noising using lifting-based discrete wavelet transform
    Erçelebi, E
    COMPUTERS IN BIOLOGY AND MEDICINE, 2004, 34 (06) : 479 - 493
  • [38] Efficient parallel architecture for lifting-based two-dimensional discrete wavelet transform
    Xiong, CY
    Tian, JW
    Liu, J
    PROCEEDINGS OF 2005 IEEE INTERNATIONAL WORKSHOP ON VLSI DESIGN AND VIDEO TECHNOLOGY, 2005, : 75 - 78
  • [39] Memory Efficient Hardware Architecture for 5/3 Lifting-Based 2-D Forward Discrete Wavelet Transform
    Savic, Goran
    Prokin, Milan
    Rajovic, Vladimir M.
    Prokin, Dragana
    MICROPROCESSORS AND MICROSYSTEMS, 2021, 87
  • [40] Hardware Architecture of Lifting-based Discrete Wavelet Transform and Sample Entropy for Epileptic Seizure Detection
    Wang, Yuanfa
    Li, Zunchao
    Feng, Lichen
    Zheng, Chuang
    Guan, Yunhe
    Zhang, Yefei
    2016 13TH IEEE INTERNATIONAL CONFERENCE ON SOLID-STATE AND INTEGRATED CIRCUIT TECHNOLOGY (ICSICT), 2016, : 1582 - 1584