Content-Structure Correspondence: A Generic Representation for Heterogeneous Structured Document

Cited by: 0
Authors
Tan, Saravadee Sae [1]
Tang, Enya Kong [1]
Ranaivo-Malancon, Bali [1]
Affiliations
[1] Multimedia Univ, Fac Informat Technol, Selangor, Malaysia
Keywords
Parsing; Subcategorization; PP attachment; Coordination attachment; Text understanding; Grammar writing
DOI
10.1016/j.sbspro.2011.10.602
Chinese Library Classification (CLC) number
H0 [Linguistics]
Discipline classification codes
030303; 0501; 050102
Abstract
On the web, most structured document collections consist of documents that come from different sources and are marked up with different types of structures. This diversity of structures has led to the emergence of heterogeneous structured documents. The heterogeneity of structured documents poses new challenges for document representation in structured document retrieval. The representation model needs to handle various types of structures as well as multiple structures in a single document. Furthermore, the same information may be represented in different structures, and the information contained in different documents may be partial and inconsistent. Therefore, the linkage of semantically related elements in the document collections needs to be modelled in the representation model. In this paper, we introduce a generic and flexible structured document model to represent heterogeneous structured documents as well as the similar correspondences in the document collections. (C) 2011 Published by Elsevier Ltd. Selection and/or peer-review under responsibility of PACLING Organizing Committee.
Pages: 226-232
Number of pages: 7