A Schema Feature Based Frequent Pattern Mining Algorithm for Semi-structured Data Stream

Authors
Fu, Weiqi [1]
Liao, Husheng [1]
Jin, Xueyun [1]
Affiliation
[1] Beijing Univ Technol, Fac Informat Technol, Beijing, Peoples R China
Source
PROCEEDINGS OF THE 2017 5TH INTERNATIONAL CONFERENCE ON FRONTIERS OF MANUFACTURING SCIENCE AND MEASURING TECHNOLOGY (FMSMT 2017) | 2017, Vol. 130
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
frequent pattern mining; semi-structured data stream; schema feature
DOI
Not available
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
Data mining extracts useful information from massive data, and frequent pattern mining is one of its important tasks. Research on frequent pattern mining for semi-structured data has made progress in recent years, and frequent pattern mining over data streams has also received considerable attention. However, only a few studies address semi-structured data and data streams together. This paper proposes an algorithm named SPrefixTreeISpan. The algorithm first segments the semi-structured data stream and then mines each segment with the pattern-growth method. Finally, all results are maintained in a structure called patternTree. In addition, the mining algorithm is optimized using the inevitable parent-child and inevitable child-parent relationships extracted from the XML schema. Experiments show that SPrefixTreeISpan has better performance.
Pages: 1329-1336
Page count: 8