GPGPU-based High Throughput Image Pre-processing Towards Large-Scale Optical Character Recognition

Cited by: 0
Authors
Gener, Serhan [1]
Dattilo, Parker [1]
Gajaria, Dhruv [1]
Fusco, Alexander [1]
Akoglu, Ali [1]
Affiliations
[1] Univ Arizona, Dept Elect & Comp Engn, Tucson, AZ 85721 USA
Source
2022 IEEE/ACS 19TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA) | 2022
Funding
National Science Foundation (USA);
Keywords
Optical Character Recognition (OCR); Tesseract; Leptonica; Image Processing; CUDA; GPU;
DOI
10.1109/AICCSA56895.2022.10017481
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Studies have shown that pre-processing digital images with operations such as scaling, rotation, and blurring allows optical character recognition (OCR) to focus on the key features in the image and improves recognition accuracy. We leverage the open-source Tesseract OCR engine and show, on a dataset of 560 phone screen images, that its accuracy can be improved through a pre-processing flow that includes thresholding, rotation, rescaling, erosion, dilation, and noise removal steps. However, the serial CPU-based implementation of this flow introduces an average latency of 48.32 ms per image. Although this latency is small for a single image, it becomes a barrier when processing millions of images with OCR. To address this, we parallelize the entire pre-processing flow on the NVIDIA P100 GPU, implement streaming-based execution, and reduce the latency to 0.846 ms. This streaming-enabled implementation makes it possible to set up a GPU-based OCR engine that processes large-scale workloads.
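To put the two reported latencies in perspective, the short Python sketch below works through the arithmetic for a workload of one million images. Only the figures 48.32 ms and 0.846 ms come from the abstract; the function names and the workload size are illustrative assumptions, and the calculation assumes that total pre-processing time scales linearly with image count and ignores the OCR step, file I/O, and any other overhead.

```python
# Back-of-the-envelope comparison of the two per-image latencies reported
# in the abstract. Everything except the two constants is an assumption
# made for illustration.

CPU_MS_PER_IMAGE = 48.32   # serial CPU pre-processing latency (from the abstract)
GPU_MS_PER_IMAGE = 0.846   # streamed GPU pre-processing latency (from the abstract)


def speedup(cpu_ms: float, gpu_ms: float) -> float:
    """Ratio of CPU latency to GPU latency."""
    return cpu_ms / gpu_ms


def total_seconds(ms_per_image: float, n_images: int) -> float:
    """Total pre-processing time in seconds, assuming linear scaling."""
    return ms_per_image * n_images / 1000.0


def images_per_second(ms_per_image: float) -> float:
    """Throughput implied by a given per-image latency."""
    return 1000.0 / ms_per_image


if __name__ == "__main__":
    n_images = 1_000_000  # hypothetical workload size
    cpu_s = total_seconds(CPU_MS_PER_IMAGE, n_images)
    gpu_s = total_seconds(GPU_MS_PER_IMAGE, n_images)
    print(f"speedup:        {speedup(CPU_MS_PER_IMAGE, GPU_MS_PER_IMAGE):.1f}x")
    print(f"CPU total:      {cpu_s / 3600:.2f} h")
    print(f"GPU total:      {gpu_s / 60:.1f} min")
    print(f"CPU throughput: {images_per_second(CPU_MS_PER_IMAGE):.1f} images/s")
    print(f"GPU throughput: {images_per_second(GPU_MS_PER_IMAGE):.1f} images/s")
```

Under these assumptions the reported latencies correspond to a speedup of roughly 57x: pre-processing one million images would take about 13.4 hours on the serial CPU flow and about 14 minutes on the streamed GPU flow.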
Pages: 7
Related papers
50 in total
  • [21] A Feature Encoding based on Fuzzy Codebook for Large-Scale Image Recognition
    Shinomiya, Yuki
    Hoshino, Yukinobu
    2015 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC 2015): BIG DATA ANALYTICS FOR HUMAN-CENTRIC SYSTEMS, 2015, : 2908 - 2913
  • [22] Enhancing the Image Pre-Processing for Large Fleets Based on a Fuzzy Approach to Handle Multiple Resolutions
    Mu, Ching-Yun
    Kung, Pin
    APPLIED SCIENCES-BASEL, 2024, 14 (18):
  • [23] Towards effective science cloud provisioning for a large-scale high-throughput computing
    Kim, Seoyoung
    Kim, Jik-Soo
    Hwang, Soonwook
    Kim, Yoonhee
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2014, 17 (04): : 1157 - 1169
  • [24] Towards effective science cloud provisioning for a large-scale high-throughput computing
    Seoyoung Kim
    Jik-Soo Kim
    Soonwook Hwang
    Yoonhee Kim
    Cluster Computing, 2014, 17 : 1157 - 1169
  • [25] Towards Real-Time Implementation for the Pre-Processing of Radar-Based Human Activity Recognition
    Bordat, Alexandre
    Dobias, Petr
    Le Kernec, Julien
    Guyard, David
    Romain, Olivier
    2022 IEEE 31ST INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS (ISIE), 2022, : 635 - 638
  • [26] Towards Large-Scale Histopathological Image Analysis: Hashing-Based Image Retrieval
    Zhang, Xiaofan
    Liu, Wei
    Dundar, Murat
    Badve, Sunil
    Zhang, Shaoting
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2015, 34 (02) : 496 - 506
  • [27] Applying image pre-processing techniques for appearance-based human posture recognition: An experimental analysis
    Rahman, MM
    Ishikawa, S
    AI 2004: ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2004, 3339 : 152 - 159
  • [28] High efficient framework for large-scale zero-shot image recognition
    Zhang Z.
    Liu Q.
    Guo D.
    Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2022, 49 (06): : 103 - 110
  • [29] Spectral Difference in the Image Domain for Large Neighborhoods, a GEOBIA Pre-Processing Step for High Resolution Imagery
    de Kok, Roeland
    REMOTE SENSING, 2012, 4 (08): : 2294 - 2313
  • [30] Recognition of optical fiber pre-warning system based on image processing
    Sun, Qian
    Feng, Hao
    Zeng, Zhou-Mo
    Guangxue Jingmi Gongcheng/Optics and Precision Engineering, 2015, 23 (02): : 334 - 341