GPGPU-based High Throughput Image Pre-processing Towards Large-Scale Optical Character Recognition

Cited by: 0

Authors:

Gener, Serhan [1]

Dattilo, Parker [1]

Gajaria, Dhruv [1]

Fusco, Alexander [1]

Akoglu, Ali [1]

Affiliations:

[1] Univ Arizona, Dept Elect & Comp Engn, Tucson, AZ 85721 USA

Source:

2022 IEEE/ACS 19TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA) | 2022

Funding:

U.S. National Science Foundation;

Keywords:

Optical Character Recognition (OCR); Tesseract; Leptonica; Image Processing; CUDA; GPU;

DOI:

10.1109/AICCSA56895.2022.10017481

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Studies have shown that pre-processing digital images with operations such as scaling, rotation, and blurring allows optical character recognition (OCR) to focus on the key features in the image and improves recognition accuracy. We leverage the open-source Tesseract OCR and show, on a dataset of 560 phone screen images, that its accuracy can be improved through a pre-processing flow that includes thresholding, rotation, rescaling, erosion, dilation, and noise removal steps. However, the serial CPU-based implementation of this flow introduces a latency of 48.32 ms per image on average. Although this latency is small for a single image, it becomes a barrier when processing millions of images with OCR. To address this, we parallelize the entire pre-processing flow on the Nvidia P100 GPU, implement streaming-based execution, and reduce the latency to 0.846 ms. This streaming-enabled implementation makes it possible to set up a GPU-based OCR engine for large-scale workloads.
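To put the two reported latencies in perspective, the short sketch below works through the arithmetic. The per-image figures (48.32 ms and 0.846 ms) come from the abstract; the workload of one million images, the function names, and the assumption that total time scales linearly with image count are illustrative choices, not results from the paper.

```python
# Per-image pre-processing latencies reported in the abstract (milliseconds).
CPU_MS_PER_IMAGE = 48.32
GPU_MS_PER_IMAGE = 0.846


def total_seconds(ms_per_image: float, n_images: int) -> float:
    """Total time in seconds, assuming cost scales linearly with image count."""
    return ms_per_image * n_images / 1000.0


def speedup(cpu_ms: float, gpu_ms: float) -> float:
    """Ratio of CPU latency to GPU latency."""
    return cpu_ms / gpu_ms


if __name__ == "__main__":
    n_images = 1_000_000  # hypothetical workload size, not from the paper
    cpu_s = total_seconds(CPU_MS_PER_IMAGE, n_images)
    gpu_s = total_seconds(GPU_MS_PER_IMAGE, n_images)
    print(f"CPU: {cpu_s / 3600:.2f} hours")
    print(f"GPU: {gpu_s / 60:.2f} minutes")
    print(f"Speedup: {speedup(CPU_MS_PER_IMAGE, GPU_MS_PER_IMAGE):.1f}x")
```

Under these assumptions, the reported figures correspond to roughly a 57x reduction in per-image latency, which would take a million-image batch from about 13 hours of pre-processing to about 14 minutes.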


Pages: 7
