A fast algorithm for bottom-up document layout analysis

Cited by: 85
Authors
Simon, A
Pret, JC
Johnson, AP
Affiliation
[1] Institute for Computer Applications in Molecular Sciences, School of Chemistry, University of Leeds, Leeds
Keywords
document analysis; physical page layout; bottom-up layout analysis; Kruskal's algorithm; spanning tree; chemical documents;
DOI
10.1109/34.584106
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper describes a new bottom-up method for document layout analysis. The algorithm was implemented in the CLiDE (Chemical Literature Data Extraction) system (http://chem.leeds.ac.uk/ICAMS/CLiDE.html), but the method described here is suitable for a broader range of documents. It is based on Kruskal's algorithm and uses a special distance metric between the components to construct the physical page structure. The method has all the major advantages of bottom-up systems: independence from different text spacing and independence from different block alignments. The algorithm's computational complexity is reduced to linear by using heuristics and path compression.
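As a rough illustration of the idea in the abstract, the sketch below groups page components with Kruskal's algorithm over a union-find structure that uses path compression. It is not the authors' implementation: the abstract does not define the paper's special distance metric, so the Euclidean gap between bounding boxes (`box_gap`) stands in for it, and the names `DisjointSet`, `group_components`, and the `max_gap` threshold are hypothetical. The sketch also sorts all pairs of components, which is quadratic; the heuristics that bring the paper's method down to linear time are not reproduced here.

```python
from itertools import combinations
from math import hypot


class DisjointSet:
    """Union-find with path compression and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        # First pass: locate the root of x's tree.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every node on the path directly at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False  # already in the same tree
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True


def box_gap(a, b):
    """Euclidean gap between two boxes (x0, y0, x1, y1); 0 if they overlap.

    Assumption: a stand-in for the paper's distance metric, which the
    abstract does not specify.
    """
    dx = max(0, max(a[0], b[0]) - min(a[2], b[2]))
    dy = max(0, max(a[1], b[1]) - min(a[3], b[3]))
    return hypot(dx, dy)


def group_components(boxes, max_gap):
    """Kruskal-style merging: join components in order of increasing gap,
    stopping once the gap exceeds max_gap (a hypothetical threshold)."""
    n = len(boxes)
    edges = sorted(
        (box_gap(boxes[i], boxes[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    ds = DisjointSet(n)
    for gap, i, j in edges:
        if gap > max_gap:
            break
        ds.union(i, j)
    groups = {}
    for i in range(n):
        groups.setdefault(ds.find(i), []).append(i)
    return sorted(groups.values())


# Two pairs of nearby boxes separated by a wide gutter.
boxes = [(0, 0, 10, 10), (12, 0, 22, 10), (100, 0, 110, 10), (112, 0, 122, 10)]
blocks = group_components(boxes, max_gap=5)
```

With these four boxes and `max_gap=5`, the two left-hand boxes merge into one block and the two right-hand boxes into another, because the 78-unit gutter between them exceeds the threshold. Stopping the merge at a threshold yields a spanning forest rather than a single spanning tree, which is one simple way to read page blocks off the structure.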
Pages: 273-277
Number of pages: 5