CLS and CLS Close: The Scalable Method for Mining the Semi Structured Data Set

Authors
Gaol, Ford Lumban [1]
Widjaja, Belawati H. [1]
Affiliations
[1] Univ Indonesia, Fac Comp Sci, Bogor, Indonesia
Keywords
frequent pattern; closed pattern; graph mining; CLS code; canonical label
DOI
10.1007/978-1-4020-8735-6_35
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Semistructured patterns can be formally modeled as graph patterns. The most important problem in mining large semistructured datasets is the scalability of the method. With the successful development of efficient and scalable algorithms for mining frequent itemsets and sequences, it is natural to extend the study to a more general pattern mining problem: mining frequent semistructured patterns, or graph patterns. In this paper, we extend the pattern-growth methodology and develop a novel algorithm called CLS (Canonical Labeling System), which discovers frequent connected subgraphs efficiently using either a depth-first or a breadth-first search strategy. A novel canonical labeling system and search order are devised to support efficient pattern growth. CLS is simpler and more efficient than other methods because it combines pattern growing and pattern checking into one procedure. Based on CLS, we develop CLS Close to mine closed frequent graphs, which not only eliminates redundant patterns but also substantially increases mining efficiency, especially in the presence of large graph patterns.
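The abstract does not spell out the CLS code itself, so the following is only a minimal sketch of the two ideas it names, under stated assumptions. `canonical_label` is a hypothetical brute-force labeling (the smallest code over all vertex orderings), which gives isomorphic small graphs the same label; it is not the authors' CLS code and is only practical for tiny graphs. `closed_patterns` is a hypothetical filter that keeps a frequent pattern only if no proper super-pattern has the same support, which is the usual definition of a closed pattern. All names, and the string labels used in the usage note, are assumptions made for illustration.

```python
from itertools import permutations


def canonical_label(vertex_labels, edges):
    """Brute-force canonical label for a small undirected labeled graph.

    vertex_labels: list of string labels; the index is the vertex id.
    edges: dict mapping frozenset({u, v}) to a string edge label.
    Assumption: this is an illustrative stand-in, not the paper's CLS code.
    """
    n = len(vertex_labels)
    best = None
    for perm in permutations(range(n)):
        # perm[i] is the original vertex placed at position i.
        code = [vertex_labels[v] for v in perm]
        # Append the upper triangle of the adjacency matrix, row by row;
        # "-" marks a missing edge.
        for i in range(n):
            for j in range(i + 1, n):
                code.append(edges.get(frozenset((perm[i], perm[j])), "-"))
        code = tuple(code)
        if best is None or code < best:
            best = code
    return best


def closed_patterns(support, is_proper_subpattern):
    """Keep patterns that have no proper super-pattern with equal support.

    support: dict mapping pattern -> support count.
    is_proper_subpattern(p, q): True if p is strictly contained in q
    (a caller-supplied predicate; hypothetical for this sketch).
    """
    return {
        p: s
        for p, s in support.items()
        if not any(
            s == t and is_proper_subpattern(p, q) for q, t in support.items()
        )
    }
```

As a usage example, the path A-B-C with edge labels `x` and `y` receives the same label whether its vertices are stored as `["A", "B", "C"]` or as `["C", "A", "B"]` with the edges renumbered to match. The closedness filter can be tried with patterns represented as frozensets of edges and `lambda p, q: p < q` as the containment predicate, which simplifies real subgraph containment to plain set inclusion.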
Pages: 186 / +
Page count: 2