GPGPU-based High Throughput Image Pre-processing Towards Large-Scale Optical Character Recognition

Serhan Gener, Parker Dattilo, Dhruv Gajaria, Alexander Fusco, Ali Akoglu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Studies have shown that pre-processing digital images through operations such as scaling, rotation, and blurring allows optical character recognition (OCR) to focus on the key features in the image, which improves recognition accuracy. We leverage the open-source Tesseract OCR engine and show, on a dataset of 560 phone screen images, that its accuracy can be improved through a pre-processing flow that includes thresholding, rotation, rescaling, erosion, dilation, and noise removal steps. However, the serial CPU-based implementation of this flow introduces an average latency of 48.32 ms per image. Although this latency is small for a single image, it becomes a barrier when processing millions of images with OCR. To address this, we parallelize the entire pre-processing flow on the Nvidia P100 GPU, implement streaming-based execution, and reduce the latency to 0.846 ms. This streaming-enabled implementation makes it possible to set up a GPU-based OCR engine for large-scale workloads.
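As a rough illustration of why the per-image latency matters at scale, the two figures reported in the abstract can be turned into a speedup and a total pre-processing time. This is only a back-of-the-envelope sketch: the workload of one million images is an assumed example size, the average latency is assumed to stay constant across the whole workload, and the time spent in OCR itself is not included.

```python
# Latency figures are taken from the abstract.
CPU_MS_PER_IMAGE = 48.32   # serial CPU-based pre-processing, average per image
GPU_MS_PER_IMAGE = 0.846   # streaming GPU-based pre-processing on the P100

# Hypothetical workload size, chosen only for illustration.
NUM_IMAGES = 1_000_000


def total_seconds(ms_per_image: float, num_images: int) -> float:
    """Total pre-processing time in seconds, assuming constant average latency."""
    return ms_per_image * num_images / 1000.0


speedup = CPU_MS_PER_IMAGE / GPU_MS_PER_IMAGE
cpu_hours = total_seconds(CPU_MS_PER_IMAGE, NUM_IMAGES) / 3600.0
gpu_minutes = total_seconds(GPU_MS_PER_IMAGE, NUM_IMAGES) / 60.0

print(f"speedup: {speedup:.1f}x")
print(f"CPU: {cpu_hours:.1f} h, GPU: {gpu_minutes:.1f} min")
```

Under these assumptions, the reported latencies correspond to a speedup of roughly 57x, which reduces the pre-processing time for one million images from about 13.4 hours to about 14 minutes.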

Original language: English (US)
Title of host publication: 2022 IEEE/ACS 19th International Conference on Computer Systems and Applications, AICCSA 2022 - Proceedings
Publisher: IEEE Computer Society
ISBN (Electronic): 9798350310085
State: Published - 2022
Externally published: Yes
Event: 19th IEEE/ACS International Conference on Computer Systems and Applications, AICCSA 2022 - Abu Dhabi, United Arab Emirates
Duration: Dec 5 2022 to Dec 7 2022

Publication series

Name: Proceedings of IEEE/ACS International Conference on Computer Systems and Applications, AICCSA
Volume: 2022-December
ISSN (Print): 2161-5322
ISSN (Electronic): 2161-5330

Conference

Conference: 19th IEEE/ACS International Conference on Computer Systems and Applications, AICCSA 2022
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 12/5/22 to 12/7/22

Keywords

  • CUDA
  • GPU
  • Image Processing
  • Leptonica
  • Optical Character Recognition (OCR)
  • Tesseract

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
  • Signal Processing
  • Control and Systems Engineering
  • Electrical and Electronic Engineering
