Efficient algorithms for finding frequent substructures from semi-structured data streams

Authors
Asai, Tatsuya [1]
Abe, Kenji [1]
Kawasoe, Shinji [1]
Arimura, Hiroki [1]
Arikawa, Setsuo [1]
Affiliation
[1] Kyushu Univ, Higashi Ku, 6-10-1 Hakozaki, Fukuoka 8128581, Japan
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we study an online data mining problem for streams of semi-structured data such as XML. Modeling semi-structured data and patterns as labeled ordered trees, we present an online algorithm, StreamT, that receives fragments of unseen, possibly infinite semi-structured data in document order through a data stream and can return the current set of frequent patterns immediately on request at any time. We also give modifications of the algorithm for other online mining models. Furthermore, we implement our algorithms under different online models and candidate management strategies, and present empirical analyses to evaluate them.
Pages: 29 / +
Number of pages: 3