A Data Mining Approach to XML Dissemination

Cited by: 0
Authors
Wang, Xiaoling [1]
Ester, Martin [2]
Qian, Weining [1]
Zhou, Aoying [1]
Affiliations
[1] East China Normal Univ, Inst Software Engn, Shanghai, Peoples R China
[2] Simon Fraser Univ, Sch Comp Sci, Burnaby, BC, Canada
Keywords
XML classification; pattern matching; XML dissemination; frequent structural pattern; feature vector
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Currently, users' interests are expressed as XPath or XQuery queries in XML dissemination applications. These queries require good knowledge of the structure and contents of the documents that will arrive, as well as knowledge of XQuery, which few consumers have. In some cases, where distinguishing relevant from irrelevant documents requires considering a large number of features, formulating such a query may be impossible. This paper introduces a data mining approach to XML dissemination that uses a given document collection of the user to automatically learn a classifier modelling his/her information needs. Also discussed are the corresponding optimization methods that allow a dissemination server to execute a massive number of classifiers simultaneously. An experimental evaluation on several real XML document sets demonstrates the accuracy and efficiency of the proposed approach.
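The idea of learning a classifier from a user's document collection, rather than asking the user to write a query, can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes root-to-element tag paths as a stand-in for frequent structural patterns, a hypothetical `min_support` threshold, and a small Bernoulli naive Bayes model as the classifier. All function names and the sample documents are invented for illustration.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter


def tag_paths(xml_text):
    """Return the set of root-to-element tag paths in one XML document."""
    paths = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return paths


def frequent_paths(docs, min_support=0.5):
    """Keep paths occurring in at least min_support of the documents (assumed threshold)."""
    counts = Counter(p for d in docs for p in tag_paths(d))
    return {p for p, c in counts.items() if c / len(docs) >= min_support}


def to_vector(xml_text, features):
    """Binary feature vector: 1 if the document contains the path, else 0."""
    present = tag_paths(xml_text)
    return [1 if f in present else 0 for f in features]


def train(relevant, irrelevant, min_support=0.5):
    """Fit a Bernoulli naive Bayes model with Laplace smoothing (illustrative choice)."""
    features = sorted(frequent_paths(relevant, min_support)
                      | frequent_paths(irrelevant, min_support))
    total = len(relevant) + len(irrelevant)
    classes = {}
    for label, docs in (("relevant", relevant), ("irrelevant", irrelevant)):
        vectors = [to_vector(d, features) for d in docs]
        probs = [(sum(col) + 1) / (len(docs) + 2) for col in zip(*vectors)]
        classes[label] = (math.log(len(docs) / total), probs)
    return {"features": features, "classes": classes}


def classify(model, xml_text):
    """Return the label with the highest log-posterior score."""
    vec = to_vector(xml_text, model["features"])
    scores = {}
    for label, (log_prior, probs) in model["classes"].items():
        scores[label] = log_prior + sum(
            math.log(p if x else 1 - p) for x, p in zip(vec, probs)
        )
    return max(scores, key=scores.get)


# Hypothetical sample collection supplied by a user.
relevant = [
    "<paper><title>a</title><author>b</author></paper>",
    "<paper><title>c</title><abstract>d</abstract></paper>",
]
irrelevant = [
    "<order><item>x</item><price>1</price></order>",
    "<order><item>y</item></order>",
]
model = train(relevant, irrelevant)
print(classify(model, "<paper><title>z</title></paper>"))
```

In a dissemination setting, a server would hold one such model per subscriber and score each arriving document against all of them; since feature extraction depends only on the document, it could plausibly be shared across models, which is the kind of optimization the abstract alludes to.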
Pages: 442 / +
Page count: 2