Incremental mining of the schema of semistructured data

Cited by: 1
Authors
Zhou, AY [1]
Jin, W [1]
Zhou, SG [1]
Qian, WN [1]
Tian, ZP [1]
Affiliations
[1] Fudan Univ, Dept Comp Sci, Shanghai 200433, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
data mining; incremental mining; semistructured data; schema; algorithm
DOI
10.1007/BF02948811
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Semistructured data lack any fixed, rigid schema, even though some implicit structure typically appears in the data. The huge number of on-line applications makes it important and imperative to mine the schema of semistructured data, both for users (e.g., to gather useful information and facilitate querying) and for systems (e.g., to optimize access). The critical problem is to discover the hidden structure in the semistructured data. Current methods for extracting Web data structure are either general and independent of application background, or bound to a concrete environment such as HTML or XML. Both, however, are expensive and have difficulty keeping up with the frequent and complicated changes in Web data. This paper discusses the problem of incrementally mining the schema of semistructured data after the raw data have been updated. An algorithm for incrementally mining the schema of semistructured data is provided, together with experimental results showing that incremental mining of semistructured data is more efficient than non-incremental mining.
Pages: 241-248
Page count: 8
Related papers
  • [1] Incremental mining of the schema of semistructured data
    Aoying Zhou
    Wen Jin
    Shuigeng Zhou
    Weining Qian
    Zenping Tian
    [J]. Journal of Computer Science and Technology, 2000, 15 : 241 - 248
  • [2] Interactive mining of schema for semistructured data
    Liu, YB
    Feng, YC
    [J]. DATA MINING AND KNOWLEDGE DISCOVERY: THEORY, TOOLS AND TECHNOLOGY IV, 2002, 4730 : 432 - 441
  • [3] Incremental Mining of the Schema of Semistructured Data
    周傲英
    金文
    周水庚
    钱卫宁
    田增平
    [J]. Journal of Computer Science & Technology, 2000, (03) : 241 - 248
  • [4] Schema Mining: Finding Structural Regularity among Semistructured Data
    Laur, P. A.
    Masseglia, F.
    Poncelet, P.
    [J]. LECTURE NOTES IN COMPUTER SCIENCE, 2000, 1910 : 498 - 503
  • [5] Incremental maintenance for views on semistructured data
    Wang, M.Z.
    Liu, H.L.
    Shi, B.L.
    [J]. Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2001, 38 (02):
  • [6] Extracting Schema from Semistructured Data with Weight Tag
    Li, Jiuzhong
    Shi, Shuo
    [J]. ADVANCES IN NEURAL NETWORKS - ISNN 2009, PT 3, PROCEEDINGS, 2009, 5553 : 1137 - 1145
  • [7] Schema extracting and query processing for semistructured data in COMMIX
    Wang, T.J.
    Tang, S.W.
    Yang, D.Q.
    Liu, Y.F.
    Tong, Y.H.
    [J]. Ruan Jian Xue Bao/Journal of Software, 2001, 12 (SUPPL.) : 230 - 236
  • [8] Extracting typical classes and a database schema from semistructured data
    Suzuki, N
    Sato, Y
    Hayase, M
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2001, E84D (01) : 100 - 112
  • [9] Incremental Schema Discovery at Scale for RDF Data
    Bouhamoum, Redouane
    Kedad, Zoubida
    Lopes, Stephane
    [J]. SEMANTIC WEB, ESWC 2021, 2021, 12731 : 195 - 211
  • [10] Automatic wrapper system for semistructured documents based on data mining
    [J]. Rancea, I. (irina.rancea@gmail.com), 2012, (74):