Incremental mining of the schema of semistructured data

Cited by: 0
Authors
Aoying Zhou
Wen Jin
Shuigeng Zhou
Weining Qian
Zenping Tian
Affiliation
[1] Fudan University, Department of Computer Science
Keywords
data mining; incremental mining; semistructured data; schema; algorithm
DOI
Not available
Abstract
Semistructured data lack a fixed, rigid schema, even though some implicit structure typically appears in the data. The huge number of on-line applications makes it important to mine the schema of semistructured data, both for users (e.g., to gather useful information and facilitate querying) and for systems (e.g., to optimize access). The critical problem is to discover the hidden structure in the semistructured data. Current methods for extracting the structure of Web data are either general and independent of the application background, or bound to a concrete environment such as HTML or XML. Both are costly and struggle to keep up with the frequent and complicated changes in Web data. This paper discusses the problem of incrementally mining the schema of semistructured data after the raw data are updated. An algorithm for incrementally mining the schema of semistructured data is presented, and experimental results are given which show that incremental mining of semistructured data is more efficient than non-incremental mining.
Pages: 241-248
Number of pages: 7