Overview of the INEX 2008 XML Mining Track: Categorization and Clustering of XML Documents in a Graph of Documents

Cited by: 0
Authors
Denoyer, Ludovic [1 ]
Gallinari, Patrick [1 ]
Affiliations
[1] Univ Paris 06, LIP6, F-75252 Paris 05, France
Source
Keywords
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
We describe here the XML Mining Track at INEX 2008. This track was launched to explore two main ideas: first, identifying key problems and new challenges in the emerging field of mining semi-structured documents; and second, studying and assessing the potential of machine learning techniques for generic Machine Learning (ML) tasks in the structured domain, i.e. classification and clustering of semi-structured documents. This year, the track focuses on the supervised classification and the unsupervised clustering of XML documents using link information. We consider a corpus of about 100,000 Wikipedia pages with the associated hyperlinks. The participants have developed models using the content information, the internal structure information of the XML documents, and the link information between documents.
Pages: 401-411
Page count: 11
Related papers
50 items in total
  • [1] Overview of the INEX 2009 XML Mining Track: Clustering and Classification of XML Documents
    Nayak, Richi
    De Vries, Christopher M.
    Kutty, Sangeetha
    Geva, Shlomo
    Denoyer, Ludovic
    Gallinari, Patrick
    [J]. FOCUSED RETRIEVAL AND EVALUATION, 2010, 6203 : 366 - +
  • [2] Overview of the INEX 2010 XML Mining Track: Clustering and Classification of XML Documents
    De Vries, Christopher M.
    Nayak, Richi
    Kutty, Sangeetha
    Geva, Shlomo
    Tagarelli, Andrea
    [J]. COMPARATIVE EVALUATION OF FOCUSED RETRIEVAL, 2011, 6932 : 363 - +
  • [3] UJM at INEX 2008 XML Mining Track
    Gery, Mathias
    Largeron, Christine
    Moulin, Christophe
    [J]. ADVANCES IN FOCUSED RETRIEVAL, 2009, 5631 : 446 - +
  • [4] Clustering of XML documents
    Guillaume, D
    Murtagh, F
    [J]. COMPUTER PHYSICS COMMUNICATIONS, 2000, 127 (2-3) : 215 - 227
  • [5] UJM at INEX 2009 XML Mining Track
    Largeron, Christine
    Moulin, Christophe
    Gery, Mathias
    [J]. FOCUSED RETRIEVAL AND EVALUATION, 2010, 6203 : 426 - +
  • [6] PKU at INEX 2010 XML Mining Track
    Wang, Songlin
    Liang, Feng
    Yang, Jianwu
    [J]. COMPARATIVE EVALUATION OF FOCUSED RETRIEVAL, 2011, 6932 : 383 - 395
  • [7] Clustering XML documents by structure
    Dalamagas, T
    Cheng, T
    Winkel, KJ
    Sellis, T
    [J]. METHODS AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2004, 3025 : 112 - 121
  • [8] Clustering XML Documents by Structure
    Lesniewska, Anna
    [J]. ADVANCES IN DATABASES AND INFORMATION SYSTEMS, 2010, 5968 : 238 - 246
  • [9] Clustering schemaless XML documents
    Shen, Y
    Wang, B
    [J]. ON THE MOVE TO MEANINGFUL INTERNET SYSTEMS 2003: COOPIS, DOA, AND ODBASE, 2003, 2888 : 767 - 784
  • [10] Semantic Clustering of XML Documents
    Tagarelli, Andrea
    Greco, Sergio
    [J]. ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2010, 28 (01)