Techniques for efficient DCT/IDCT implementation on generic GPU

Cited by: 0
Authors
Fang, B [1]
Shen, GB [1]
Li, SP [1]
Chen, HF [1]
Affiliation
[1] Zhejiang Univ, Dept Informat Sci & Elect Eng, Hangzhou 310027, Peoples R China
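Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract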
The emergence of programmable graphics processing units (GPUs) has led to increasing interest in offloading numerically intensive computations to graphics hardware. DCT/IDCT is widely adopted in modern image/video compression standards and is usually one of the most computationally expensive parts. In this paper, we present several techniques for efficient implementation of DCT/IDCT on a generic programmable GPU, using direct matrix multiplication. Our experimental results demonstrate that the speed of IDCT on the GPU with the proposed techniques can well exceed that on a CPU with MMX optimization.
Pages: 1126-1129
Page count: 4
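The "direct matrix multiplication" formulation named in the abstract can be illustrated with a minimal CPU-side sketch. It assumes an 8x8 block and the orthonormal DCT-II basis, neither of which is stated in this record, and the function names (`dct_matrix`, `dct2`, `idct2`) are hypothetical. It is not the authors' GPU implementation, whose specific techniques are not described here. It only shows that the 2-D transform reduces to two matrix products, Y = C X Cᵀ, with the inverse X = Cᵀ Y C.

```python
import math

N = 8  # assumed block size; common in image/video codecs, not stated in the record


def dct_matrix(n=N):
    """Build the n x n orthonormal DCT-II basis matrix C."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain dense matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct2(block):
    """Forward 2-D DCT as two matrix products: Y = C * X * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """Inverse 2-D DCT: X = C^T * Y * C (C is orthonormal)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


if __name__ == "__main__":
    block = [[(r * N + col) % 17 for col in range(N)] for r in range(N)]
    restored = idct2(dct2(block))
    err = max(abs(restored[r][col] - block[r][col])
              for r in range(N) for col in range(N))
    print("max round-trip error:", err)
```

Because the basis matrix is fixed for a given block size, it can be computed once and reused for every block, which is part of what makes the matrix-product form attractive on hardware built for dense multiply-accumulate work.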