Document Layout Analysis using Multigaussian Fitting

Cited by: 5
Authors
Melinda, Laiphangbam [1]
Ghanapuram, Raghu [1]
Bhagvati, Chakravarthy [1]
Affiliation
[1] Univ Hyderabad, Sch Comp & Informat Sci, Hyderabad 500046, Telangana, India
Keywords
Document Layout Analysis; Bounding Boxes; Height Histogram; Multigaussian; Nearest Neighbor; PAGE SEGMENTATION; EXTRACTION; SYSTEM
DOI
10.1109/ICDAR.2017.127
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a novel technique for layout analysis of documents with complex Manhattan layouts. The technique is designed for Indic script newspapers, and also works on many other types of documents with Manhattan layouts, not necessarily in Indic scripts. The main idea behind the algorithm is to categorise the physical elements of a document into noise, text, titles and graphics based on their heights. A histogram of heights is computed from the bounding boxes of connected components, and a multigaussian fit is used to discover optimal split points between the categories. The Gaussian with the highest peak is assumed to correspond to running text. Running text regions are grouped into blocks using nearest neighbour analysis. These initial regions are further refined using a second-level classification of the other elements into graphics, light-coloured text on a dark background, and graphical separators. The resulting layouts show accuracies comparable to some of the best and most popular algorithms, such as MHS (winner of the ICDAR-RDCL2015 competition) and Aletheia (a tool developed by the PRImA Research Lab). Its performance is shown by tests on many Indic script newspapers and other documents, and by comparison with Aletheia and MHS on the ICDAR dataset. Our initial results on an Indic document dataset show high performance in identifying running text (> 98%), with an accuracy of 82% in identifying the other elements. Ground truth data for the Indic script newspaper documents is being generated for more extensive quantitative testing. The strength of our algorithm is that it requires only one parameter, the number of Gaussians to fit to the height histogram data, and is therefore easy to automate and adapt to many documents.
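The height-based categorisation described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the multigaussian fit is computed, so this example swaps in a plain expectation-maximisation (EM) fit of a one-dimensional Gaussian mixture, initialised by cutting the sorted heights at their widest gaps. The function names (`fit_height_mixture`, `split_points`, `running_text_index`, `categorise`), the initialisation, and the synthetic heights are all assumptions made for illustration and are not taken from the paper. In practice the heights would come from the bounding boxes of connected components.

```python
import bisect
import math


def fit_height_mixture(heights, k, iterations=50, min_std=0.5):
    """Fit k Gaussians to heights with EM; return [(weight, mean, std)] sorted by mean."""
    data = sorted(heights)
    n = len(data)
    # Initialisation (an assumption): cut the sorted heights at the k-1 widest gaps.
    widest = sorted(range(n - 1), key=lambda i: data[i + 1] - data[i], reverse=True)[:k - 1]
    bounds = [0] + sorted(i + 1 for i in widest) + [n]
    weights, means, stds = [], [], []
    for a, b in zip(bounds, bounds[1:]):
        group = data[a:b]
        mean = sum(group) / len(group)
        var = sum((x - mean) ** 2 for x in group) / len(group)
        weights.append(len(group) / n)
        means.append(mean)
        stds.append(max(math.sqrt(var), min_std))
    for _ in range(iterations):
        # E-step: responsibility of each Gaussian for each height
        # (the common factor 1/sqrt(2*pi) cancels in the normalisation).
        resp = []
        for x in data:
            p = [w * math.exp(-0.5 * ((x - m) / s) ** 2) / s
                 for w, m, s in zip(weights, means, stds)]
            total = sum(p)
            resp.append([v / total for v in p] if total > 0 else [1.0 / k] * k)
        # M-step: re-estimate weight, mean and standard deviation of each Gaussian.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            stds[j] = max(math.sqrt(var), min_std)
    return sorted(zip(weights, means, stds), key=lambda c: c[1])


def log_density(x, component):
    """Log of the weighted Gaussian density, up to a shared additive constant."""
    w, m, s = component
    return math.log(w / s) - 0.5 * ((x - m) / s) ** 2


def split_points(components, step=0.1):
    """Between neighbouring Gaussians, find where the taller-height one takes over."""
    cuts = []
    for left, right in zip(components, components[1:]):
        cut = (left[1] + right[1]) / 2.0  # fallback: midpoint of the two means
        x = left[1]
        while x <= right[1]:
            if log_density(x, right) >= log_density(x, left):
                cut = x
                break
            x += step
        cuts.append(cut)
    return cuts


def running_text_index(components):
    """Index of the Gaussian with the highest peak (weight / std)."""
    return max(range(len(components)), key=lambda i: components[i][0] / components[i][2])


def categorise(height, cuts):
    """Category index of a height: the number of split points at or below it."""
    return bisect.bisect_right(cuts, height)


# Synthetic component heights in pixels: specks, running text, titles.
heights = [2, 3, 3, 4] * 10 + [18, 19, 20, 20, 20, 21, 22] * 80 + [56, 60, 64] * 10
components = fit_height_mixture(heights, k=3)
cuts = split_points(components)
text_idx = running_text_index(components)
print("components (weight, mean, std):", components)
print("split points:", cuts)
print("running-text category:", text_idx)
```

As in the abstract, the only tuning choice exposed here is `k`, the number of Gaussians. The gap-based initialisation is convenient for well-separated data like this toy example, but real height histograms overlap and may need a more careful starting point.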
Pages: 747-752
Page count: 6