A document image segmentation system using analysis of connected components

Cited by: 9
Authors
Zirari, F. [1]
Ennaji, A. [1]
Nicolas, S. [1]
Mammass, D. [2]
Affiliations
[1] Univ Rouen, LITIS Lab, Rouen, France
[2] Ibn Zohr Univ, IRF SIC Lab, Agadir, Morocco
Keywords
text/non-text separating; connected components; graph; structural analysis; document image;
DOI
10.1109/ICDAR.2013.154
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
Page segmentation into text and non-text elements is an essential preprocessing step before optical character recognition (OCR). When segmentation is poor, an OCR classification engine produces garbage characters because of the non-text elements present. This paper presents a method for separating the textual and non-textual components of document images using graph-based modeling and structural analysis. The method is fast and efficient, and it adequately separates the graphical and textual parts of a document. We evaluated the method on two well-known datasets: the UW-III dataset and the ICDAR 2009 page segmentation competition dataset. Comparisons with two state-of-the-art methods show that our method performs better on this task.
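The abstract does not describe how the graph is built or which structural features are used, so the following is only a minimal illustrative sketch of the general idea (connected components, a neighbourhood graph over them, and a structural rule), not the authors' method. The function names, the `max_gap` and `max_height` thresholds, the height-similarity rule, and the toy page are all assumptions made for this example.

```python
from collections import deque


def connected_components(img):
    """Return bounding boxes (top, left, bottom, right) of 8-connected foreground blobs."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:  # breadth-first flood fill of one component
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def build_graph(boxes, max_gap=3):
    """Link components that overlap vertically and lie within max_gap columns (assumed rule)."""
    edges = {i: set() for i in range(len(boxes))}
    for i, (t1, l1, b1, r1) in enumerate(boxes):
        for j in range(i + 1, len(boxes)):
            t2, l2, b2, r2 = boxes[j]
            overlaps_vertically = min(b1, b2) >= max(t1, t2)
            gap = max(l1, l2) - min(r1, r2) - 1  # empty columns between the boxes
            if overlaps_vertically and gap <= max_gap:
                edges[i].add(j)
                edges[j].add(i)
    return edges


def classify(boxes, edges, max_height=5):
    """Call a component text if it is small and has a neighbour of similar height (assumed rule)."""
    heights = [b - t + 1 for (t, _, b, _) in boxes]
    labels = []
    for i, hi in enumerate(heights):
        similar = any(max(hi, heights[j]) <= 2 * min(hi, heights[j]) for j in edges[i])
        labels.append("text" if hi <= max_height and similar else "non-text")
    return labels


def demo_page():
    """Toy binary page: three small glyph-like blobs on one line and one large block."""
    page = [[0] * 30 for _ in range(20)]
    for x0 in (1, 5, 9):
        for y in range(1, 4):
            for x in range(x0, x0 + 2):
                page[y][x] = 1
    for y in range(8, 18):
        for x in range(2, 20):
            page[y][x] = 1
    return page
```

On the toy page, `classify(boxes, build_graph(boxes))` with `boxes = connected_components(demo_page())` labels the three small blobs as text and the large block as non-text. A real system would work on binarized scans and would need thresholds tuned to the scan resolution.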
Pages: 753-757
Page count: 5
Related papers
50 in total
  • [31] Document image segmentation and quality improvement by moire pattern analysis
    Yang, JCY
    Tsai, WH
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2000, 15 (09) : 781 - 797
  • [32] Comparison of image segmentation of lungs using methods: connected threshold, neighborhood connected, and threshold level set segmentation
    Amanda, A. R.
    Widita, R.
    13TH SOUTH-EAST ASIAN CONGRESS OF MEDICAL PHYSICS 2015 (SEACOMP), 2016, 694
  • [33] GPU Accelerated Fuzzy Connected Image Segmentation by using CUDA
    Zhuge, Ying
    Cao, Yong
    Miller, Robert W.
    2009 ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-20, 2009, : 6341 - +
  • [34] Image segmentation using a mixture of principal components representation
    Dony, RD
    Haykin, S
    IEE PROCEEDINGS-VISION IMAGE AND SIGNAL PROCESSING, 1997, 144 (02) : 73 - 80
  • [35] Using kernel principal components for color image segmentation
    Wesolkowski, S
    APPLICATIONS OF DIGITAL IMAGE PROCESSING XXV, 2002, 4790 : 1 - 10
  • [36] A robust document processing system combining image segmentation with content-based document compression
    Yang, YB
    Yan, H
    15TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 4, PROCEEDINGS: APPLICATIONS, ROBOTICS SYSTEMS AND ARCHITECTURES, 2000, : 519 - 522
  • [37] AN INTEGRATED IMAGE SEGMENTATION IMAGE-ANALYSIS SYSTEM
    IRWIN, PDS
    WILKINSON, AJ
    LECTURE NOTES IN COMPUTER SCIENCE, 1988, 301 : 26 - 37
  • [38] Probabilistic homogeneity for document image segmentation
    Lu, Tan
    Dooms, Ann
    PATTERN RECOGNITION, 2021, 109
  • [39] Morphological Connected Openings to Image Segmentation
    Mendiola-Santibanez, Jorge D.
    Ortega-Bucio, Lidia G.
    Terol-Villalobos, Ivan
    Santillan, Israel
    2010 IEEE ELECTRONICS, ROBOTICS AND AUTOMOTIVE MECHANICS CONFERENCE (CERMA 2010), 2010, : 422 - 427
  • [40] Image segmentation by connected parametrical models
    Bukovec, M
    Truyen, R
    Likar, B
    Bernard, R
    Pernus, F
    MEDICAL IMAGING 2004: IMAGE PROCESSING, PTS 1-3, 2004, 5370 : 398 - 409