Text line segmentation of historical documents: a survey

Cited by: 243
Authors
Likforman-Sulem, Laurence
Zahour, Abderrazak
Taconet, Bruno
Affiliations
[1] Ecole Natl Super Telecommun TSI, GET, F-75013 Paris, France
[2] CNRS, LTCI, F-75013 Paris, France
[3] Univ Havre GED, IUT, F-76610 Le Havre, France
Keywords
segmentation; handwriting; text lines; historical documents; survey
DOI
10.1007/s10032-006-0023-z
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
There are a huge number of historical documents in libraries and in various National Archives that have not been exploited electronically. Although automatic reading of complete pages remains, in most cases, a long-term objective, tasks such as word spotting, text/image alignment, authentication and extraction of specific fields are in use today. For all these tasks, a major step is document segmentation into text lines. Because of the low quality and the complexity of these documents (background noise, artifacts due to aging, interfering lines), automatic text line segmentation remains an open research field. The objective of this paper is to present a survey of existing methods, developed during the last decade and dedicated to documents of historical interest.
Pages: 123 - 138
Page count: 16
Related papers
50 in total
  • [1] Text line segmentation of historical documents: a survey
    Laurence Likforman-Sulem
    Abderrazak Zahour
    Bruno Taconet
    [J]. International Journal of Document Analysis and Recognition (IJDAR), 2007, 9 : 123 - 138
  • [2] Text Line segmentation of historical Arabic documents
    Zahour, Abderrazak
    Likforman-Sulem, Laurence
    Boussalaa, Wafa
    Taconet, Bruno
    [J]. ICDAR 2007: NINTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, VOLS I AND II, PROCEEDINGS, 2007, : 138 - +
  • [3] Text Line Segmentation in Images of Handwritten Historical Documents
    Sanchez, A.
    Suarez, P. D.
    Mello, C. A. B.
    Oliveira, A. L. I.
    Alves, V. M. O.
    [J]. 2008 FIRST INTERNATIONAL WORKSHOPS ON IMAGE PROCESSING THEORY, TOOLS AND APPLICATIONS (IPTA), 2008, : 232 - +
  • [4] A Robust Hybrid Approach for Text Line Segmentation in Historical Documents
    Clausner, Christian
    Antonacopoulos, Apostolos
    Pletschacher, Stefan
    [J]. 2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012), 2012, : 335 - 338
  • [5] A Multilevel Text line Segmentation Framework for Handwritten Historical Documents
    Ben Messaoud, Ines
    Amiri, Hamid
    El Abed, Haikal
    Maergner, Volker
    [J]. 13TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR 2012), 2012, : 515 - 520
  • [6] Reducing the Human Effort in Text Line Segmentation for Historical Documents
    Granell, Emilio
    Quiros, Lorenzo
    Romero, Veronica
    Andreu Sanchez, Joan
    [J]. DOCUMENT ANALYSIS AND RECOGNITION, ICDAR 2021, PT III, 2021, 12823 : 523 - 537
  • [7] Learning-Free Text Line Segmentation for Historical Handwritten Documents
    Barakat, Berat Kurar
    Cohen, Rafi
    Droby, Ahmad
    Rabaev, Irina
    El-Sana, Jihad
    [J]. APPLIED SCIENCES-BASEL, 2020, 10 (22): 1 - 19
  • [8] Segmentation of Historical Handwritten Documents into Text Zones and Text Lines
    Gatos, Basilis
    Louloudis, Georgios
    Stamatopoulos, Nikolaos
    [J]. 2014 14TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR), 2014, : 464 - 469
  • [9] Text Line Segmentation of Tibetan Historical Documents Based on Text Core Regions Combined with Expansion Growth
    Li Jincheng
    Wang Xiaojuan
    Wang Weilan
    Lin Qiang
    Hu Pengfei
    [J]. LASER & OPTOELECTRONICS PROGRESS, 2021, 58 (02)
  • [10] Research on Text Line Segmentation of Historical Tibetan Documents Based on the Connected Component Analysis
    Wang, Yiqun
    Wang, Weilan
    Li, Zhenjiang
    Han, Yuehui
    Wang, Xiaojuan
    [J]. PATTERN RECOGNITION AND COMPUTER VISION, PT III, 2018, 11258 : 74 - 87