A graph-based approach for segmenting touching lines in historical handwritten documents

Authors
David Fernández-Mota
Josep Lladós
Alicia Fornés
Affiliations
[1] Universitat Autònoma de Barcelona, Computer Vision Center - Computer Science Department
Keywords
Text line segmentation; Handwritten documents; Document image processing; Historical document analysis
DOI
Not available
Abstract
Text line segmentation in handwritten documents is an important task in the recognition of historical documents. Handwritten document images contain text lines with multiple orientations, touching and overlapping characters between consecutive text lines and different document structures, making line segmentation a difficult task. In this paper, we present a new approach for handwritten text line segmentation solving the problems of touching components, curvilinear text lines and horizontally overlapping components. The proposed algorithm formulates line segmentation as finding the central path in the area between two consecutive lines. This is solved as a graph traversal problem. A graph is constructed using the skeleton of the image. Then, a path-finding algorithm is used to find the optimum path between text lines. The proposed algorithm has been evaluated on a comprehensive dataset consisting of five databases: ICDAR2009, ICDAR2013, UMD, the George Washington and the Barcelona Marriages Database. The proposed method outperforms the state-of-the-art considering the different types and difficulties of the benchmarking data.
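The abstract describes the method only at a high level, so the following Python sketch is an illustration of the general idea rather than the authors' algorithm. It assumes a plain pixel grid in place of the skeleton-based graph, and it uses Dijkstra's algorithm as a stand-in path finder, since the abstract does not name one. The function name `find_separating_path`, the `ink_cost` parameter and the toy image are all hypothetical.

```python
import heapq


def find_separating_path(image, ink_cost=50):
    """Return the cheapest left-to-right pixel path through a binary image.

    image: list of rows, where 1 marks ink and 0 marks background.
    Background pixels cost 1 to enter and ink pixels cost `ink_cost`,
    so the path prefers the gap between lines and cuts through ink
    only where strokes from adjacent lines touch.
    This is a simplified stand-in, not the skeleton graph of the paper.
    """
    rows, cols = len(image), len(image[0])

    def cost(r, c):
        return ink_cost if image[r][c] else 1

    dist, prev, heap = {}, {}, []
    # Every pixel in the leftmost column is a possible starting point.
    for r in range(rows):
        dist[(r, 0)] = cost(r, 0)
        heapq.heappush(heap, (dist[(r, 0)], r, 0))

    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[(r, c)]:
            continue  # stale queue entry
        if c == cols - 1:
            # Reached the right edge: walk the predecessor links back.
            path = [(r, c)]
            while path[-1] in prev:
                path.append(prev[path[-1]])
            return path[::-1]
        # Allowed moves: up, down, right, and the two rightward diagonals.
        for dr, dc in ((-1, 0), (1, 0), (0, 1), (-1, 1), (1, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + cost(nr, nc)
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (nd, nr, nc))
    return []


# Toy example: two text lines (rows 0 and 2) joined by a touching stroke
# in column 3 of the gap row.
toy_image = [
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
toy_path = find_separating_path(toy_image)
```

In the toy image the returned path stays in the gap row wherever it is free and crosses exactly one ink pixel, at the column where the two lines touch. A real implementation along the lines of the abstract would build the graph from the image skeleton instead of using every pixel as a node.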
Pages: 293-312
Page count: 19