A hierarchical representation of form documents for identification and retrieval

Cited by: 25
Authors
Pınar Duygulu
Volkan Atalay
Affiliation
Department of Computer Engineering, Middle East Technical University, Ankara, 06531 Turkey; e-mail: {duygulu,volkan}@ceng.metu.edu.tr
Keywords
Form document processing – Logical layout extraction – Retrieval – Data processing
DOI
10.1007/s100320100077
Abstract
In this paper, we present a logical representation for form documents to be used for identification and retrieval. A hierarchical structure is proposed to represent the structure of a form by using lines and the XY-tree approach. The approach is top-down and uses no domain knowledge such as the preprinted data or filled-in data. Geometrical modifications and slight variations are handled by this representation. Logically identical forms are associated with the same or similar hierarchical structure. Identification and retrieval of similar forms are performed by computing the edit distances between the generated trees.
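To make the retrieval idea concrete, the following is a minimal sketch of comparing two form trees by an edit distance. It is not the authors' algorithm: it assumes a simplified top-down (Selkow-style) tree edit distance with unit costs, and the node labels "H", "V" and "B" (horizontal cut, vertical cut, block) as well as the names `Node`, `tree_distance` and `rank_forms` are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    # Hypothetical labels: "H" = horizontal cut, "V" = vertical cut, "B" = block.
    label: str
    children: List["Node"] = field(default_factory=list)


def size(t: Node) -> int:
    """Number of nodes in the subtree rooted at t."""
    return 1 + sum(size(c) for c in t.children)


def tree_distance(a: Node, b: Node) -> int:
    """Top-down (Selkow-style) edit distance with unit costs.

    Relabeling a node costs 1; inserting or deleting a child costs the
    size of its whole subtree. Children are aligned in order, as in a
    string edit distance.
    """
    m, n = len(a.children), len(b.children)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + size(a.children[i - 1])
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + size(b.children[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + size(a.children[i - 1]),   # delete subtree from a
                d[i][j - 1] + size(b.children[j - 1]),   # insert subtree from b
                d[i - 1][j - 1]
                + tree_distance(a.children[i - 1], b.children[j - 1]),
            )
    relabel = 0 if a.label == b.label else 1
    return relabel + d[m][n]


def rank_forms(query: Node, library: Dict[str, Node]) -> List[Tuple[str, int]]:
    """Return (name, distance) pairs sorted by increasing distance to query."""
    scored = [(name, tree_distance(query, tree)) for name, tree in library.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Illustrative trees (made up, not taken from the paper).
form_a = Node("H", [Node("B"), Node("V", [Node("B"), Node("B")])])
form_b = Node("H", [Node("B"), Node("V", [Node("B"), Node("B"), Node("B")])])
form_c = Node("V", [Node("B")])

ranking = rank_forms(form_a, {"b": form_b, "c": form_c})
```

Under these assumed costs, a form that differs from the query by one extra block ranks closer than a form with a different top-level cut, which matches the intuition that slight variations should yield similar trees.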
Pages: 17-27
Number of pages: 10
Related papers
50 in total
  • [1] A hierarchical representation of form documents for identification and retrieval
    Duygulu, P
    Atalay, V
    DOCUMENT RECOGNITION AND RETRIEVAL VII, 2000, 3967 : 128 - 139
  • [2] A heuristic algorithm for hierarchical representation of form documents
    Duygulu, P
    Atalay, V
    Dincel, E
    FOURTEENTH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1 AND 2, 1998, : 929 - 931
  • [3] An Ontological Representation of Documents and Queries for Information Retrieval Systems
    Dragoni, Mauro
    Pereira, Celia Da Costa
    Tettamanzi, Andrea G. B.
    TRENDS IN APPLIED INTELLIGENT SYSTEMS, PT II, PROCEEDINGS, 2010, 6097 : 555 - 564
  • [4] The documents selection method for information retrieval results representation
    Dubinskij, A.G.
    Problemy Upravleniya I Informatiki (Avtomatika), 2002, (01) : 107 - 114
  • [5] Exploiting the semantic graph for the representation and retrieval of medical documents
    Zhao, Qing
    Kang, Yangyang
    Li, Jianqiang
    Wang, Dan
    COMPUTERS IN BIOLOGY AND MEDICINE, 2018, 101 : 39 - 50
  • [6] Hierarchical model for web multimedia documents retrieval and periodical updates
    Habib, Sami
    Safar, Maytham
    INTERNATIONAL JOURNAL OF WEB INFORMATION SYSTEMS, 2008, 4 (01) : 78 - 96
  • [7] CONTROLLING RETRIEVAL THROUGH A USER-ADAPTIVE REPRESENTATION OF DOCUMENTS
    BORDOGNA, G
    PASI, G
    INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 1995, 12 (3-4) : 317 - 339
  • [8] A fuzzy representation of HTML documents for information retrieval systems
    Molinari, A
    Pasi, G
    FUZZ-IEEE '96 - PROCEEDINGS OF THE FIFTH IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-3, 1996, : 107 - 112
  • [9] Semantic trajectory representation and retrieval via hierarchical embedding
    Gao, Chongming
    Zhang, Zhong
    Huang, Chen
    Yin, Hongzhi
    Yang, Qinli
    Shao, Junming
    INFORMATION SCIENCES, 2020, 538 : 176 - 192
  • [10] A hierarchical representation for content-based image retrieval
    Distasi, R
    Vitulano, D
    Vitulano, S
    JOURNAL OF VISUAL LANGUAGES AND COMPUTING, 2000, 11 (04) : 369 - 382