Clustering XML documents by structure based on common neighbor

Authors
Zhang, XZ [1]
Lv, TY
Wang, ZX
Zuo, WL
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130023, Peoples R China
[2] Harbin Engn Univ, Coll Comp Sci & Technol, Harbin, Peoples R China
Keywords
XML structure; clustering; common neighbor
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering XML documents is an important task. However, selecting appropriate parameter values for clustering algorithms is difficult. Moreover, current clustering algorithms treat outliers as "noise" and lack an effective mechanism for detecting them. By integrating outlier detection with clustering, the paper takes a new approach to analyzing XML documents by structure. After defining the concept of a common-neighbor-based outlier, the paper proposes a new clustering algorithm that stops clustering automatically by using the outlier information and needs only one parameter, whose appropriate value range is determined during the outlier mining process. After discussing some features of the proposed algorithm, the paper compares it with other clustering algorithms on XML datasets with different structures and on other real-life datasets.
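This record does not give the paper's formal definitions, so the following Python is only a minimal sketch of the general common-neighbor idea, not the authors' algorithm. It assumes a precomputed matrix of pairwise structural distances between documents and a single hypothetical radius parameter `eps`; the function names and the rule "an item that shares no neighbor with any other item is an outlier candidate" are illustrative assumptions.

```python
from itertools import combinations


def neighbors(dist, i, eps):
    """Return the indices j != i whose distance to item i is at most eps."""
    return {j for j in range(len(dist)) if j != i and dist[i][j] <= eps}


def common_neighbor_counts(dist, eps):
    """Map each pair (i, j), i < j, to the number of neighbors they share."""
    nbrs = [neighbors(dist, i, eps) for i in range(len(dist))]
    return {
        (i, j): len(nbrs[i] & nbrs[j])
        for i, j in combinations(range(len(dist)), 2)
    }


def outlier_candidates(dist, eps):
    """Items that share no neighbor with any other item.

    This rule is an assumption made for illustration only; the paper's
    own definition of a common-neighbor-based outlier may differ.
    """
    shared = [0] * len(dist)
    for (i, j), count in common_neighbor_counts(dist, eps).items():
        shared[i] += count
        shared[j] += count
    return [i for i, total in enumerate(shared) if total == 0]


# Hypothetical distances: documents 0-2 are structurally close, 3 is far away.
DIST = [
    [0.0, 0.1, 0.2, 0.9],
    [0.1, 0.0, 0.1, 0.9],
    [0.2, 0.1, 0.0, 0.9],
    [0.9, 0.9, 0.9, 0.0],
]

if __name__ == "__main__":
    print(common_neighbor_counts(DIST, eps=0.3))
    print(outlier_candidates(DIST, eps=0.3))
```

With these made-up distances, each pair among documents 0, 1 and 2 shares one neighbor, while document 3 shares none and is flagged as the only outlier candidate.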
Pages: 771-776
Number of pages: 6