Text line segmentation of historical documents: a survey

Cited by: 243
Authors
Likforman-Sulem, Laurence
Zahour, Abderrazak
Taconet, Bruno
Affiliations
[1] Ecole Natl Super Telecommun TSI, GET, F-75013 Paris, France
[2] CNRS, LTCI, F-75013 Paris, France
[3] Univ Havre GED, IUT, F-76610 Le Havre, France
Keywords
segmentation; handwriting; text lines; historical documents; survey
DOI
10.1007/s10032-006-0023-z
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
There are a huge number of historical documents in libraries and in various national archives that have not been exploited electronically. Although automatic reading of complete pages remains, in most cases, a long-term objective, tasks such as word spotting, text/image alignment, authentication and extraction of specific fields are in use today. For all these tasks, a major step is document segmentation into text lines. Because of the low quality and the complexity of these documents (background noise, artifacts due to aging, interfering lines), automatic text line segmentation remains an open research field. The objective of this paper is to present a survey of existing methods, developed during the last decade and dedicated to documents of historical interest.
Pages: 123-138
Page count: 16