GPGPU-based High Throughput Image Pre-processing Towards Large-Scale Optical Character Recognition

Cited by: 0

Authors:

Gener, Serhan [1]

Dattilo, Parker [1]

Gajaria, Dhruv [1]

Fusco, Alexander [1]

Akoglu, Ali [1]

Affiliations:

[1] Univ Arizona, Dept Elect & Comp Engn, Tucson, AZ 85721 USA

Source:

2022 IEEE/ACS 19TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA) | 2022

Funding:

U.S. National Science Foundation;

Keywords:

Optical Character Recognition (OCR); Tesseract; Leptonica; Image Processing; CUDA; GPU;

DOI:

10.1109/AICCSA56895.2022.10017481

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Studies have shown that pre-processing digital images with operations such as scaling, rotation, and blurring allows optical character recognition (OCR) to focus on the key features in the image and improves recognition accuracy. We leverage the open-source Tesseract OCR engine and show, on a dataset of 560 phone screen images, that its accuracy can be improved through a pre-processing flow that includes thresholding, rotation, rescaling, erosion, dilation, and noise removal steps. However, the serial CPU-based implementation of this flow introduces an average latency of 48.32 ms per image. Although this latency is small for a single image, it becomes a barrier when processing millions of images with OCR. To address this, we parallelize the entire pre-processing flow on the Nvidia P100 GPU, implement streaming-based execution, and reduce the latency to 0.846 ms. This streaming-enabled implementation makes it possible to set up a GPU-based OCR engine for large-scale workloads.
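To put the two reported latencies in perspective, the following is a minimal back-of-the-envelope sketch in Python. The per-image figures (48.32 ms and 0.846 ms) come from the abstract; the workload size of one million images and the assumption that total time scales linearly with image count are illustrative only and are not taken from the paper.

```python
# Per-image pre-processing latencies reported in the abstract (milliseconds).
CPU_MS_PER_IMAGE = 48.32
GPU_MS_PER_IMAGE = 0.846

# Hypothetical workload size, chosen only for illustration.
NUM_IMAGES = 1_000_000


def total_seconds(ms_per_image: float, num_images: int) -> float:
    """Total pre-processing time in seconds, assuming linear scaling."""
    return ms_per_image * num_images / 1000.0


# Ratio of the two reported per-image latencies.
speedup = CPU_MS_PER_IMAGE / GPU_MS_PER_IMAGE

# Projected totals for the hypothetical workload.
cpu_hours = total_seconds(CPU_MS_PER_IMAGE, NUM_IMAGES) / 3600.0
gpu_minutes = total_seconds(GPU_MS_PER_IMAGE, NUM_IMAGES) / 60.0

print(f"speedup: {speedup:.1f}x")
print(f"CPU: {cpu_hours:.1f} h, GPU: {gpu_minutes:.1f} min")
```

Under these assumptions, the reported figures correspond to a speedup of roughly 57x: about 13.4 hours of serial CPU pre-processing versus about 14.1 minutes on the GPU for one million images. The estimate covers only the pre-processing flow, not the OCR step itself or any I/O.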

Pages: 7

