Script-independent text line segmentation in freestyle handwritten documents

Cited by: 120
Authors
Li, Yi [1]
Zheng, Yefeng [2]
Doermann, David
Jaeger, Stefan [1,3]
Affiliations
[1] Univ Maryland, Inst Adv Comp Studies, Language & Media Proc Lab, College Pk, MD 20742 USA
[2] Siemens Corp Res, Princeton, NJ 08540 USA
[3] Partner Inst Computat Biol, Grp Syst Bioinformat, CAS MPG, Shanghai 200031, Peoples R China
Keywords
handwritten text line segmentation; document image analysis; density estimation; level set methods
DOI
10.1109/TPAMI.2007.70792
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Text line segmentation in freestyle handwritten documents remains an open document analysis problem. Curvilinear text lines and small gaps between neighboring text lines present a challenge to algorithms developed for machine-printed or hand-printed documents. In this paper, we propose a novel approach based on density estimation and a state-of-the-art image segmentation technique, the level set method. From an input document image, we estimate a probability map where each element represents the probability of the underlying pixel belonging to a text line. The level set method is then exploited to determine the boundary of neighboring text lines by evolving an initial estimate. Unlike connected component-based methods ([1] and [2], for example), the proposed algorithm does not use any script-specific knowledge. Extensive quantitative experiments on freestyle handwritten documents with diverse scripts such as Arabic, Chinese, Korean, and Hindi demonstrate that our algorithm consistently outperforms previous methods [1], [2], [3]. Further experiments show that the proposed algorithm is robust to scale change, rotation, and noise.
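The abstract names two stages: a probability map estimated from the input image, then a boundary evolved from an initial estimate. This record does not give the paper's estimator or its level set formulation, so the following is only a minimal sketch of the first stage. It assumes an anisotropic Gaussian smoothing (wider along the writing direction) as the density estimate and a plain threshold as a stand-in for the initial estimate. All function names, parameters, and default values are hypothetical, and the level set evolution itself is not implemented here.

```python
import math


def gaussian_kernel_1d(sigma):
    """Return a normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_rows(image, kernel):
    """Convolve every row with the kernel; pixels outside the image count as 0."""
    radius = len(kernel) // 2
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = x + k - radius
                if 0 <= xx < width:
                    acc += w * image[y][xx]
            out[y][x] = acc
    return out


def transpose(image):
    """Swap rows and columns of a list-of-lists image."""
    return [list(col) for col in zip(*image)]


def text_line_probability_map(binary_image, sigma_x=4.0, sigma_y=1.0):
    """Assumed density estimate: anisotropic Gaussian smoothing of ink pixels.

    binary_image holds 1 for ink and 0 for background. sigma_x > sigma_y
    merges characters along a line while keeping neighboring lines apart.
    The result is rescaled so that its maximum is 1.
    """
    horizontal = smooth_rows(binary_image, gaussian_kernel_1d(sigma_x))
    smoothed = transpose(
        smooth_rows(transpose(horizontal), gaussian_kernel_1d(sigma_y)))
    peak = max(max(row) for row in smoothed)
    if peak == 0:
        return smoothed  # blank page: nothing to normalize
    return [[v / peak for v in row] for row in smoothed]


def initial_estimate(prob_map, threshold=0.5):
    """Stand-in initial estimate: keep pixels whose probability passes a threshold."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]
```

In the approach the abstract describes, an estimate of this kind would then be refined by the level set method to find the boundaries between neighboring text lines. The smoothing widths above are illustrative and would need tuning to the stroke width and line spacing of a given document.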
Pages: 1313-1329
Page count: 17