Page segmentation using document model

Cited by: 0
Authors
Jain, AK
Yu, B
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Transforming a paper document to its electronic version in a form suitable for efficient storage, retrieval and interpretation continues to be a challenging problem. An efficient document model is necessary to solve this problem. Document modeling involves techniques of thresholding, skew detection, geometric layout analysis and logical layout analysis. The derived model can then be used in document storage and retrieval. In this paper, we use the traditional bottom-up approach based on connected component extraction to efficiently implement page segmentation and region identification. A new document model, which preserves top-down generation information, is proposed; based on this model, a document is logically represented for interactive editing, storage, retrieval, transfer and logical analysis.
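The bottom-up approach mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives no algorithmic details, so everything below is an assumption made for illustration and not the authors' method: the function names `connected_components` and `merge_boxes`, the use of 8-connectivity, the greedy merging rule and the `max_gap` threshold are all hypothetical. The sketch extracts connected components from a binarized page and then groups nearby component bounding boxes into candidate regions.

```python
from collections import deque


def connected_components(image):
    """Return bounding boxes (top, left, bottom, right) of the
    8-connected foreground (value 1) components of a binary image."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def merge_boxes(boxes, max_gap):
    """Greedily merge boxes separated by at most max_gap background
    pixels along both axes (assumed grouping rule, for illustration)."""
    regions = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(regions)):
            for j in range(i + 1, len(regions)):
                a, b = regions[i], regions[j]
                # Gaps are negative when the boxes overlap on that axis.
                v_gap = max(a[0], b[0]) - min(a[2], b[2]) - 1
                h_gap = max(a[1], b[1]) - min(a[3], b[3]) - 1
                if v_gap <= max_gap and h_gap <= max_gap:
                    regions[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                  max(a[2], b[2]), max(a[3], b[3]))
                    del regions[j]
                    merged = True
                    break
            if merged:
                break
    return regions


# Toy binarized page: two nearby marks at the top left, one mark at the bottom right.
PAGE = [
    [1, 1, 0, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 1, 0, 1],
]

components = connected_components(PAGE)
regions = merge_boxes(components, max_gap=1)
print(components)  # three component boxes
print(regions)     # the two top-left components merged into one region
```

On this toy page the two top-left components lie one pixel apart and are merged into a single region, while the bottom-right component stays separate because two blank rows separate it from the others. A real system would also need thresholding and skew correction beforehand, as the abstract notes.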
Pages: 34-38
Number of pages: 5