Semantic Structural Similarity for Clustering XML Documents

Cited by: 3
Authors
Kim, Tae-Soon [1]
Lee, Ju-Hong [1]
Song, Jae-Won [1]
Affiliation
[1] Inha Univ, Sch Engn & Comp Sci, Inchon, South Korea
DOI
10.1109/ICHIT.2008.183
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The number of XML documents is increasing rapidly. To analyze the information represented in XML documents efficiently, research on XML document clustering is actively in progress. The key issue is how to devise the similarity measure between XML documents to be used for clustering. Since XML documents have a hierarchical structure, it is not appropriate to cluster them using a general document similarity measure. Previous work on similarity measures for XML document clustering considers only structural information and ignores semantic information. In this paper, we propose a novel similarity measure that considers both the structural and the semantic information of XML documents. Our experiments show that the proposed method improves clustering accuracy from the semantic point of view compared to previous work.
Pages: 552-557
Page count: 6
Related papers
50 in total
  • [1] Semantic Structural Similarity Measure for Clustering XML Documents
    Song, Ling
    Ma, Jun
    Lei, Jingsheng
    Zhang, Dongmei
    Wang, Zhen
    [J]. WEB INFORMATION SYSTEMS AND MINING, PROCEEDINGS, 2009, 5854 : 232 - +
  • [2] Using structural similarity for clustering XML documents
    Aitelhadj, Ali
    Boughanem, Mohand
    Mezghiche, Mohamed
    Souam, Fatiha
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2012, 32 (01) : 109 - 139
  • [3] Using structural similarity for clustering XML documents
    Ali Aïtelhadj
    Mohand Boughanem
    Mohamed Mezghiche
    Fatiha Souam
    [J]. Knowledge and Information Systems, 2012, 32 : 109 - 139
  • [4] Clustering XML documents based on structural similarity
    Xing, Guangming
    Xia, Zhonghang
    Guo, Jinhua
    [J]. ADVANCES IN DATABASES: CONCEPTS, SYSTEMS AND APPLICATIONS, 2007, 4443 : 905 - +
  • [5] Semantic Clustering of XML Documents
    Tagarelli, Andrea
    Greco, Sergio
    [J]. ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2010, 28 (01)
  • [6] A METHODOLOGY FOR USING EDGES TO MEASURE STRUCTURAL AND SEMANTIC SIMILARITY OF XML DOCUMENTS
    Qiu, Hong-Jun
    Yu, Wen-Jing
    [J]. PROCEEDINGS OF 2009 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-6, 2009, : 1653 - +
  • [7] A progressive clustering algorithm to group the XML data by structural and semantic similarity
    Nayak, Richi
    Tran, Tien
    [J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2007, 21 (04) : 723 - 743
  • [8] Structure and Content Similarity for Clustering XML Documents
    Zhang, Lijun
    Li, Zhanhuai
    Chen, Qun
    Li, Ning
    [J]. WEB-AGE INFORMATION MANAGEMENT, 2010, 6185 : 116 - 124
  • [9] Clustering XML Documents Using Closed Frequent Subtrees: A Structural Similarity Approach
    Kutty, Sangeetha
    Tran, Tien
    Nayak, Richi
    Li, Yuefeng
    [J]. FOCUSED ACCESS TO XML DOCUMENTS, 2008, 4862 : 183 - 194
  • [10] Structural similarity between XML documents and DTDs
    Ng, PKL
    Ng, VTY
    [J]. COMPUTATIONAL SCIENCE - ICCS 2003, PT III, PROCEEDINGS, 2003, 2659 : 412 - 421