Efficient Parallel Implementation of Morphological Operation on GPU and FPGA

Cited by: 0

Authors:

Li, Teng [1]

Dou, Yong [1]

Jiang, Jingfei [1]

Gao, Jing [2]

Affiliations:

[1] Natl Univ Def & Technol, Natl Lab Parallel & Distributed Proc, Changsha, Hunan, Peoples R China

[2] AIR CHINA LTD, Informat Management Dept, Beijing, Peoples R China

Source:

2014 INTERNATIONAL CONFERENCE ON SECURITY, PATTERN ANALYSIS, AND CYBERNETICS (SPAC) | 2014

Keywords:

Morphological operation; Computer vision; GPU; FPGA;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Morphological operations are powerful and versatile tools in image and video applications. They are applied across a wide range of computer vision domains, from object recognition and feature extraction to moving-object detection, where real-time, high-performance processing is required. However, the throughput of morphological operations is constrained by their convolutional characteristic. In this paper, we analyze the parallelism of morphological operations and present parallel implementations on the graphics processing unit (GPU) and the field-programmable gate array (FPGA). For the GPU platform, we propose optimized schemes based on global memory, texture memory, and shared memory, achieving a throughput of 942.63 Mbps with a 3x3 structuring element. For the FPGA platform, we present an optimized method based on the traditional delay-line architecture; with a 3x3 structuring element, it achieves a throughput of 462.64 Mbps.
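
This record does not reproduce the paper's GPU kernels or FPGA design, so the following is only a minimal Python sketch of the two ideas the abstract names, under stated assumptions. It assumes grayscale images stored as lists of rows, a flat 3x3 structuring element, and a border policy that uses only the neighbors that exist. The function names (`erode3x3`, `dilate3x3`, `erode3x3_stream`) are hypothetical and chosen for illustration. The first part shows the window-based ("convolutional") character of erosion and dilation, where every output pixel is independent of the others. The second part mimics a delay-line arrangement by consuming pixels in raster order and tapping a buffer of the last two lines plus three pixels.

```python
from collections import deque


def _morph3x3(img, reduce_fn):
    """Apply reduce_fn (min or max) over each pixel's 3x3 neighborhood."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Each output pixel reads only its own window, so all (y, x)
            # positions could be computed in parallel.
            out[y][x] = reduce_fn(
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            )
    return out


def erode3x3(img):
    """Grayscale erosion: minimum over the 3x3 window."""
    return _morph3x3(img, min)


def dilate3x3(img):
    """Grayscale dilation: maximum over the 3x3 window."""
    return _morph3x3(img, max)


def erode3x3_stream(pixels, w):
    """Delay-line style erosion over a raster-order pixel stream.

    Yields (y, x, value) for interior pixels only (an assumed simplification).
    The buffer holds the last 2*w + 3 pixels, i.e. two full lines plus three
    pixels, which is exactly enough to cover one 3x3 window.
    """
    taps = deque(maxlen=2 * w + 3)
    for n, p in enumerate(pixels):
        taps.append(p)
        y, x = divmod(n, w)
        if y >= 2 and x >= 2:
            # taps[0] is pixel (y-2, x-2); row r, column c of the window
            # sits at offset r*w + c. The window center is (y-1, x-1).
            value = min(taps[r * w + c] for r in range(3) for c in range(3))
            yield y - 1, x - 1, value


if __name__ == "__main__":
    image = [
        [5, 5, 5, 5],
        [5, 1, 5, 5],
        [5, 5, 5, 9],
        [5, 5, 5, 5],
    ]
    print(erode3x3(image))
    print(dilate3x3(image))
    flat = [p for row in image for p in row]
    print(list(erode3x3_stream(flat, 4)))
```

In the direct version the two inner loops are the natural unit to map onto one GPU thread per pixel. The streaming version trades that random access for a fixed-size buffer, which is the property a delay-line architecture relies on.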


Pages: 430-435

Page count: 6
