Efficient Parallel Implementation of Morphological Operation on GPU and FPGA

Cited by: 0
Authors
Li, Teng [1]
Dou, Yong [1]
Jiang, Jingfei [1]
Gao, Jing [2]
Affiliations
[1] Natl Univ Def & Technol, Natl Lab Parallel & Distributed Proc, Changsha, Hunan, Peoples R China
[2] AIR CHINA LTD, Informat Management Dept, Beijing, Peoples R China
Keywords
Morphological operation; Computer vision; GPU; FPGA;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Morphological operations are powerful and versatile image and video processing tools applied across a wide range of domains, from object recognition and feature extraction to moving-object detection in computer vision, where real-time, high-performance processing is required. However, the throughput of morphological operations is constrained by their convolution-like characteristics. In this paper, we analyze the parallelism of morphological operations and present parallel implementations on the graphics processing unit (GPU) and the field-programmable gate array (FPGA). For the GPU platform, we propose optimized schemes based on global memory, texture memory, and shared memory, achieving a throughput of 942.63 Mbps with a 3x3 structuring element. For the FPGA platform, we present an optimized method based on the traditional delay-line architecture; with a 3x3 structuring element, it achieves a throughput of 462.64 Mbps.
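To make the parallelism mentioned in the abstract concrete, the following is a minimal sequential sketch of 3x3 grayscale erosion and dilation. It is not the authors' GPU or FPGA implementation. The function names (`morph3x3`, `erode`, `dilate`), the flat all-ones structuring element, and the clamp-to-edge border handling are assumptions made for illustration only.

```python
# Sketch only: 3x3 grayscale erosion/dilation with a flat (all-ones)
# structuring element. Border handling (clamping coordinates to the image
# edge) is an assumption, not taken from the paper.

def morph3x3(image, reduce_fn):
    """Apply reduce_fn (min or max) over the 3x3 neighbourhood of each pixel."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Each output pixel reads only input pixels, so every (y, x)
            # iteration is independent and could map to one GPU thread.
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out[y][x] = reduce_fn(window)
    return out


def erode(image):
    """Erosion: each output pixel is the minimum of its 3x3 neighbourhood."""
    return morph3x3(image, min)


def dilate(image):
    """Dilation: each output pixel is the maximum of its 3x3 neighbourhood."""
    return morph3x3(image, max)
```

Because neighbouring output pixels share six of their nine input pixels, a parallel version can benefit from staging input tiles in fast on-chip memory; the memory-based GPU schemes and the FPGA delay-line architecture named in the abstract are two ways of exploiting that reuse.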
Pages: 430-435
Number of pages: 6