An Algorithmic Approach for Text Recognition from Printed/Typed Text Images

Authors
Agrawal, Neha [1]
Kaur, Arashdeep [1]
Affiliations
[1] Amity Univ Uttar Pradesh, Amity Sch Engn & Technol, Dept Comp Sci & Engn, Noida, India
Keywords
OCR; Otsu's algorithm; Hough transform; English alphabets; skew detection
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Extracting text from scanned documents and text images is an increasingly important task. Optical Character Recognition (OCR) is used to analyze text in images. The proposed algorithm takes a scanned copy of a document as input and extracts the text from the image into text format, using Otsu's algorithm for segmentation and the Hough transform for skew detection. The system is confined to recognizing English letters (A-Z, a-z) and numerals (0-9). An OCR technique is used to recognize the characters. Validation tests were performed on screenshots of typed text and on images of scanned documents from Internet sources. Experimental results indicate that the proposed algorithm can recognize letters set in the Verdana font at size 14 and that it also performs well on rotated images. The average accuracy in determining the rotation angle was 90%, and the overall system accuracy was 93%.
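The abstract names two building blocks, Otsu's algorithm and the Hough transform, without giving implementation details. The following is a minimal Python sketch of both ideas, not the authors' code. The function names `otsu_threshold` and `estimate_skew_degrees`, the angle range, the step size, and the per-angle scoring rule are all assumptions made for illustration. Pixels are assumed to be 8-bit grayscale values, and foreground points are assumed to be (x, y) coordinates of text pixels.

```python
import math
from collections import Counter


def otsu_threshold(pixels):
    """Return the gray level t (0..255) that maximizes between-class variance.

    Pixels with value <= t form one class, pixels > t the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # sum of gray levels in the lower class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break               # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def estimate_skew_degrees(points, max_angle=15.0, step=0.5):
    """Estimate text skew with a Hough-style vote over (angle, rho).

    For each candidate angle, every point votes for the line offset
    rho = y*cos(angle) - x*sin(angle). When the angle matches the slope of
    the text lines, votes pile up in a few rho bins, so the sum of squared
    bin counts peaks. The scoring rule is an illustrative assumption.
    """
    best_angle, best_score = 0.0, -1
    n_steps = int(round(2 * max_angle / step))
    for k in range(n_steps + 1):
        angle = -max_angle + k * step
        theta = math.radians(angle)
        c, s = math.cos(theta), math.sin(theta)
        votes = Counter(round(y * c - x * s) for x, y in points)
        score = sum(v * v for v in votes.values())
        if score > best_score:
            best_score, best_angle = score, angle
    return best_angle
```

In a pipeline like the one described, the threshold would separate ink from paper before the foreground coordinates are passed to the skew estimator. The returned angle is measured in the coordinate system of the input points, so its sign depends on whether y grows upward or downward.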
Pages: 876-879
Number of pages: 4