A Hybrid Method for Text Line Extraction in Handwritten Document Images

Cited by: 3
Authors
Kiumarsi, Ehsan [1]
Alaei, Alireza [2]
Affiliations
[1] Shahid Bahonar Univ Kerman, Fac Elect Engn, Kerman, Iran
[2] Griffith Univ, Griffith Inst Tourism & Sch ICT, Nathan, Qld, Australia
Keywords
Text line extraction; handwritten document image; connected component grouping; projection profile analysis; segmentation
DOI
10.1109/ICFHR-2018.2018.00050
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text line segmentation in handwritten document images, one of the preliminary steps in document image recognition, is a challenging problem. This paper presents a hybrid method for text line extraction in handwritten document images. First, a connected component (CC) labelling method followed by CC filtering is employed to extract a set of CCs from the input document image. A new distance measure is introduced to compute normal distances between the extracted CCs. By traversing the normal distance matrix from both the right and left directions, half-chains of CCs are constructed. The CC half-chains are merged to obtain CC full-chains, from which separator lines are obtained. A gradient metric is proposed to detect and remove touching text lines. Using the remaining separator lines, the adaptive projection profile of the image is computed, and coarse text line extraction is performed based on this profile. Finally, fine text line extraction is performed by applying a postprocessing step. The method was evaluated on two benchmarks, the ICDAR2013 handwriting segmentation contest dataset and the Kannada dataset, which together comprise handwritten document images in English, Greek, Bengali, and Kannada. Experimental results indicate promising performance compared with some state-of-the-art methods.
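The first and last stages named in the abstract (CC labelling with filtering, then projection-profile-based coarse extraction) can be illustrated with a minimal sketch. This is not the authors' implementation: the distance measure, half-chain and full-chain construction, gradient metric, and adaptive profile are omitted, and a plain global horizontal projection profile is substituted. The function names `label_components` and `coarse_lines`, the `min_size` threshold, and the toy `page` are all hypothetical choices made for illustration.

```python
from collections import deque

def label_components(img, min_size=2):
    """Collect 8-connected foreground components of a binary image (list of rows).

    Components with fewer than min_size pixels are discarded, as a stand-in
    for the CC filtering step (the paper's actual filter is not specified here).
    """
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if img[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, pixels = deque([(r, c)]), []
                while queue:  # breadth-first flood fill
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and img[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                if len(pixels) >= min_size:
                    components.append(pixels)
    return components

def coarse_lines(components, rows):
    """Return (top, bottom) row bands where the horizontal projection is non-zero.

    Assumption: a global, non-adaptive profile; it only separates lines that
    have at least one empty row between them.
    """
    profile = [0] * rows
    for pixels in components:
        for y, _ in pixels:
            profile[y] += 1
    bands, start = [], None
    for y, count in enumerate(profile):
        if count > 0 and start is None:
            start = y
        elif count == 0 and start is not None:
            bands.append((start, y - 1))
            start = None
    if start is not None:
        bands.append((start, rows - 1))
    return bands

# Toy binary page: two "text lines" plus one isolated noise pixel in row 3.
page = [
    [1, 1, 0, 1, 1, 0],
    [1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
]
print(coarse_lines(label_components(page), len(page)))
```

With the default `min_size=2` the noise pixel is filtered out before the profile is computed, so it does not create a spurious band. A global profile like this fails on skewed or touching lines, which is the case the paper's separator lines and adaptive profile are meant to address.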
Pages: 241-246
Page count: 6
Related papers
50 in total
  • [1] Text-line extraction from handwritten document images using GAN
    Kundu, Soumyadeep
    Paul, Sayantan
    Bera, Suman Kumar
    Abraham, Ajith
    Sarkar, Ram
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2020, 140
  • [2] Text Line Extraction in Document Images
    Wang, Liuan
    Fan, Wei
    Sun, Jun
    Naoi, Satoshi
    Tanaka, Hiroshi
    [J]. 2015 13TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), 2015, : 191 - 195
  • [3] Text Line Extraction of Curved Document Images Using Hybrid Metric
    Huang, Zuming
    Gu, Jie
    Meng, Gaofeng
    Pan, Chunhong
    [J]. PROCEEDINGS 3RD IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION ACPR 2015, 2015, : 251 - 255
  • [4] Distance transform based text-line extraction from unconstrained handwritten document images
    Bera, Suman Kumar
    Kundu, Soumyadeep
    Kumar, Neeraj
    Sarkar, Ram
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2021, 186
  • [5] DENSE PREDICTION FOR TEXT LINE SEGMENTATION IN HANDWRITTEN DOCUMENT IMAGES
    Quang Nhat Vo
    Lee, GueeSang
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2016, : 3264 - 3268
  • [6] Text line extraction for historical document images
    Saabni, Raid
    Asi, Abedelkadir
    El-Sana, Jihad
    [J]. PATTERN RECOGNITION LETTERS, 2014, 35 : 23 - 33
  • [7] FAST TEXT LINE EXTRACTION IN DOCUMENT IMAGES
    Ha, Seong Jong
    Jin, Bora
    Cho, Nam Ik
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2012), 2012, : 797 - 800
  • [8] Text Line Segmentation in Handwritten Document Images Using Tensor Voting
    Toan Dinh Nguyen
    Gueesang Lee
    [J]. IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2011, E94A (11) : 2434 - 2441
  • [9] Text line extraction from handwritten document pages based on line contour estimation
    Sarkar, Ram
    Halder, Sougata
    Malakar, Samir
    Das, Nibaran
    Basu, Subhadip
    Nasipuri, Mita
    [J]. 2012 THIRD INTERNATIONAL CONFERENCE ON COMPUTING COMMUNICATION & NETWORKING TECHNOLOGIES (ICCCNT), 2012,
  • [10] Text line segmentation from struck-out handwritten document images
    Shivakumara, Palaiahnakote
    Jain, Tanmay
    Pal, Umapada
    Surana, Nitish
    Antonacopoulos, Apostolos
    Lu, Tong
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2022, 210