GPGPU-based High Throughput Image Pre-processing Towards Large-Scale Optical Character Recognition

Cited by: 0
Authors
Gener, Serhan [1]
Dattilo, Parker [1]
Gajaria, Dhruv [1]
Fusco, Alexander [1]
Akoglu, Ali [1]
Affiliations
[1] Univ Arizona, Dept Elect & Comp Engn, Tucson, AZ 85721 USA
Source
2022 IEEE/ACS 19th International Conference on Computer Systems and Applications (AICCSA), 2022
Funding
National Science Foundation (USA)
Keywords
Optical Character Recognition (OCR); Tesseract; Leptonica; Image Processing; CUDA; GPU
DOI
10.1109/AICCSA56895.2022.10017481
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Studies have shown that pre-processing digital images with operations such as scaling, rotation, and blurring allows optical character recognition (OCR) to focus on the key features in the image and improves recognition accuracy. We leverage the open-source Tesseract OCR engine and show, on a dataset of 560 phone screen images, that its accuracy can be improved through a pre-processing flow that includes thresholding, rotation, rescaling, erosion, dilation, and noise removal steps. However, the serial CPU-based implementation of this flow introduces an average latency of 48.32 ms per image. Although this latency is small for a single image, it becomes a barrier when processing millions of images with OCR. To address this, we parallelize the entire pre-processing flow on an Nvidia P100 GPU, implement streaming-based execution, and reduce the latency to 0.846 ms. This streaming-enabled implementation makes it possible to set up a GPU-based OCR engine for large-scale workloads.
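To put the two reported latencies in perspective, the following is a minimal back-of-the-envelope sketch in Python. It uses only the 48.32 ms and 0.846 ms figures from the abstract; the workload of one million images is a hypothetical value chosen for illustration, and the function names are not from the paper. It assumes total time scales linearly with image count and ignores I/O, data transfer, and the OCR step itself.

```python
# Per-image pre-processing latencies reported in the abstract (milliseconds).
CPU_MS_PER_IMAGE = 48.32
GPU_MS_PER_IMAGE = 0.846


def total_seconds(ms_per_image: float, num_images: int) -> float:
    """Total pre-processing time, assuming cost scales linearly with image count."""
    return ms_per_image * num_images / 1000.0


def speedup(baseline_ms: float, improved_ms: float) -> float:
    """Ratio of the baseline latency to the improved latency."""
    return baseline_ms / improved_ms


if __name__ == "__main__":
    n = 1_000_000  # hypothetical workload size, not a figure from the paper
    cpu_s = total_seconds(CPU_MS_PER_IMAGE, n)
    gpu_s = total_seconds(GPU_MS_PER_IMAGE, n)
    print(f"CPU: {cpu_s / 3600:.1f} h")    # CPU: 13.4 h
    print(f"GPU: {gpu_s / 60:.1f} min")    # GPU: 14.1 min
    print(f"Speedup: {speedup(CPU_MS_PER_IMAGE, GPU_MS_PER_IMAGE):.1f}x")  # Speedup: 57.1x
```

Under these assumptions, the reported per-image latencies correspond to roughly a 57x reduction in pre-processing time, which turns a workload of hours into one of minutes.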
Pages: 7