Document page segmentation and layout analysis using soft ordering

Authors
Mitchell, PE [1]
Yan, H [1]
Affiliations
[1] Univ Sydney, Sch Elect & Informat Engn, Sydney, NSW 2006, Australia
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a novel algorithm for layout analysis of document images. A major component of this algorithm is an independent segmentation algorithm that identifies text and graphics regions. The segmentation algorithm first locates document patterns and then performs classification using run-length characteristics, spread analysis and adjacency relations. A key feature of the layout analysis algorithm is soft ordering, which provides a means of ordering regions in a more logical way and allows for some overlap between separate regions. This is very useful for processing documents that are slightly skewed or irregular in layout. The algorithm has been tested on many different documents and can successfully recognise single-column and multicolumn documents, even when the column format varies several times on one page. Furthermore, it can process documents with text tightly wrapped around graphics and documents that are slightly skewed.
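The abstract does not give the details of soft ordering, so the following is only a minimal sketch of the general idea it describes: ordering regions while tolerating a small overlap between them. It is not the authors' algorithm. The `Region` class, the `precedes` rule, the `soft_order` function, the tolerance `tol` and the sample coordinates are all hypothetical names and values chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Axis-aligned bounding box of a text or graphics region (hypothetical)."""
    name: str
    left: float
    top: float
    right: float
    bottom: float


def precedes(a: Region, b: Region, tol: float) -> bool:
    """Return True if `a` should be read before `b`.

    Overlaps of up to `tol` pixels are ignored; this tolerance is the
    assumed "soft" part of the ordering.
    """
    shared_width = min(a.right, b.right) - max(a.left, b.left)
    if shared_width > tol:
        # Same column: `a` comes first if it sits above `b`, even when its
        # bottom edge intrudes up to `tol` pixels into `b`.
        return a.bottom <= b.top + tol
    # Different columns: `a` comes first if it lies to the left of `b`.
    return a.right <= b.left + tol


def soft_order(regions: list[Region], tol: float = 5.0) -> list[Region]:
    """Greedy topological sort over the `precedes` relation."""
    remaining = list(regions)
    ordered = []
    while remaining:
        # Regions that no other remaining region has to precede.
        free = [r for r in remaining
                if not any(precedes(o, r, tol) for o in remaining if o is not r)]
        # If the relation happens to be cyclic, consider every remaining region.
        candidates = free or remaining
        nxt = min(candidates, key=lambda r: (r.top, r.left))
        ordered.append(nxt)
        remaining.remove(nxt)
    return ordered


# A slightly skewed two-column page: neighbouring boxes overlap by 2-3 pixels.
page = [
    Region("right_b", 299, 348, 594, 600),
    Region("left_b", 12, 297, 300, 600),
    Region("title", 10, 10, 590, 50),
    Region("right_a", 297, 60, 592, 350),
    Region("left_a", 10, 58, 298, 300),
]
names = [r.name for r in soft_order(page, tol=5.0)]
print(names)
```

With `tol=0.0` the small overlaps break the above/left-of relations and the columns are no longer read one after the other, which is the kind of failure on skewed pages that a tolerance is meant to avoid. This simple rule does not handle pages whose column format changes part-way down; that case would need a richer relation than the one sketched here.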
Pages: 458-461
Page count: 4