An Approach of Strike-through Text Identification from Handwritten Documents

Cited by: 11
Authors
Adak, Chandranath [1]
Chaudhuri, Bidyut B. [2]
Affiliations
[1] Kalyani Univ, Dept Comp Sci & Engn, Kalyani 741235, W Bengal, India
[2] Indian Stat Inst, Comp Vis & Pattern Recognit Unit, Kolkata 700108, India
Keywords
Document image analysis; Handwritten document; Optical character recognition; Strike-through text;
DOI
10.1109/ICFHR.2014.113
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A handwritten document may contain strike-through text. If such text is fed into an OCR system, the output will be garbage. In this paper, we propose a scheme to detect such strike-through text/words. Using a graph-based model, we represent a textual connected component as a graph. The start/end and intersection points of the ink strokes of a component are marked as graph nodes. An edge exists between two nodes if they are connected by object (ink) pixels. By eliminating parallel edges and self-loops, we obtain a simple, undirected, edge-weighted graph of the text component. The edge weight is found by adding horizontal/vertical moves weighted by 1 and diagonal moves weighted by √2. In this graph, we find the shortest path which is nearly as long as the width of the text component and maintains a reasonable degree of straightness. This path, if it exists, is identified as the strike-through line. Here we deal with handwritten documents in English, Bengali and Devanagari scripts. Our approach delivers fairly good results.
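The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that stroke segments have already been traced into 8-connected pixel chains between nodes given as (row, column) pairs, that the lighter of two parallel edges is the one kept, and that straightness is judged by comparing the path length with the straight-line distance between its endpoints. The function names and the thresholds `min_span` and `max_detour` are hypothetical.

```python
import heapq
import math


def path_length(pixels):
    """Weight of an 8-connected pixel chain: 1 per horizontal or vertical
    move, sqrt(2) per diagonal move."""
    total = 0.0
    for (r0, c0), (r1, c1) in zip(pixels, pixels[1:]):
        dr, dc = abs(r1 - r0), abs(c1 - c0)
        if max(dr, dc) != 1:
            raise ValueError("pixels must be 8-connected neighbours")
        total += math.sqrt(2) if dr == 1 and dc == 1 else 1.0
    return total


def build_graph(strokes):
    """Build a simple undirected weighted graph from (u, v, pixel_chain)
    triples. Self-loops are dropped; for parallel edges the lighter one is
    kept (an assumption, the abstract does not say which one survives)."""
    graph = {}
    for u, v, chain in strokes:
        if u == v:
            continue  # self-loop
        w = path_length(chain)
        graph.setdefault(u, {})
        graph.setdefault(v, {})
        if w < graph[u].get(v, math.inf):
            graph[u][v] = w
            graph[v][u] = w
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns (node_path, length) or (None, inf)."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, math.inf
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[target]


def find_strike_through(graph, width, min_span=0.8, max_detour=1.2):
    """Return the widest node path that spans at least min_span * width
    columns and is at most max_detour times longer than the straight line
    between its endpoints, or None. Both thresholds are hypothetical."""
    best = None
    nodes = sorted(graph, key=lambda n: n[1])  # left to right by column
    for i, s in enumerate(nodes):
        for t in nodes[i + 1:]:
            span = t[1] - s[1]
            if span < min_span * width:
                continue
            path, length = shortest_path(graph, s, t)
            if path is None:
                continue
            if length <= max_detour * math.dist(s, t):
                if best is None or span > best[0]:
                    best = (span, path)
    return best[1] if best else None
```

Calling `find_strike_through(build_graph(strokes), width)` on the traced strokes of one connected component returns the node sequence of a candidate strike-through line, or `None` when no sufficiently long and straight path exists. Trying every node pair is quadratic in the number of nodes, which is acceptable for a single word-sized component but is a simplification chosen here for clarity.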
Pages: 643-648
Page count: 6