Overview of the INEX 2008 XML Mining Track: Categorization and Clustering of XML Documents in a Graph of Documents

Authors
Denoyer, Ludovic [1 ]
Gallinari, Patrick [1 ]
Affiliations
[1] Univ Paris 06, LIP6, F-75252 Paris 05, France
Source
Keywords
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
We describe the XML Mining Track at INEX 2008. The track was launched to explore two main ideas: first, identifying key problems and new challenges in the emerging field of mining semi-structured documents; and second, studying and assessing the potential of machine learning techniques for generic Machine Learning (ML) tasks in the structured domain, i.e., classification and clustering of semi-structured documents. This year, the track focuses on the supervised classification and unsupervised clustering of XML documents using link information. We consider a corpus of about 100,000 Wikipedia pages with their associated hyperlinks. Participants developed models using the content information, the internal structure information of the XML documents, and the link information between documents.
Pages: 401-411
Number of pages: 11