Clustering schemaless XML documents

Cited by: 0
Authors
Shen, Y [1]
Wang, B [1]
Affiliation
[1] Univ Hull, Dept Comp Sci, Kingston Upon Hull HU6 7RX, N Humberside, England
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses the problem of semantically clustering the growing number of schemaless XML documents. In our approach, each document in a collection is first represented by a macro-path sequence. Second, a similarity matrix for the collection is constructed by computing the similarity between each pair of macro-path sequences. Finally, the desired clusters are produced by hierarchical clustering. Experimental results are also reported.
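The three steps in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not define a macro-path sequence or the similarity measure, so this sketch assumes root-to-leaf tag paths as a stand-in for macro-paths, uses `difflib.SequenceMatcher` as an assumed similarity, and uses average-linkage agglomerative clustering with a hypothetical stopping `threshold`. All function names are illustrative.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher
from itertools import combinations


def path_sequence(xml_text):
    """Return root-to-leaf tag paths in document order.

    Assumption: a stand-in for the paper's macro-path sequence,
    whose exact definition is not given in the abstract.
    """
    root = ET.fromstring(xml_text)
    paths = []

    def walk(node, prefix):
        current = prefix + "/" + node.tag
        children = list(node)
        if not children:
            paths.append(current)  # leaf element: record the full path
        for child in children:
            walk(child, current)

    walk(root, "")
    return paths


def similarity_matrix(sequences):
    """Symmetric matrix of pairwise similarities in [0, 1] (assumed measure)."""
    n = len(sequences)
    matrix = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        score = SequenceMatcher(None, sequences[i], sequences[j]).ratio()
        matrix[i][j] = matrix[j][i] = score
    return matrix


def hierarchical_clusters(matrix, threshold):
    """Average-linkage agglomerative clustering over a similarity matrix.

    Merging stops once the most similar pair of clusters has an average
    similarity below `threshold` (a hypothetical parameter).
    """
    clusters = [[i] for i in range(len(matrix))]

    def avg_sim(a, b):
        return sum(matrix[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        (x, y), best = max(
            (((p, q), avg_sim(clusters[p], clusters[q]))
             for p, q in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        clusters[x] = clusters[x] + clusters[y]  # x < y, so deleting y is safe
        del clusters[y]
    return [sorted(c) for c in clusters]


docs = [
    "<book><title>A</title><author>B</author></book>",
    "<book><title>C</title><author>D</author><year>2001</year></book>",
    "<cd><artist>E</artist><track>F</track></cd>",
]
sequences = [path_sequence(d) for d in docs]
matrix = similarity_matrix(sequences)
print(hierarchical_clusters(matrix, threshold=0.5))  # prints [[0, 1], [2]]
```

In this toy collection the two `book` documents share two of their paths and are merged, while the `cd` document shares none and stays in its own cluster.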
Pages: 767-784
Page count: 18