High Performance Frequent Subgraph Mining on Transaction Datasets: A Survey and Performance Comparison

Cited by: 1
Authors
Bismita S. Jena
Cynthia Khan
Rajshekhar Sunderraman
Affiliations
[1] Department of Computer Science
[2] Georgia State University
Keywords
frequent subgraphs; isomorphism; Spark
DOI
Not available
Chinese Library Classification (CLC) number
TP391.41
Discipline classification code
080203
Abstract
Graph data mining is a crucial and unavoidable area of research. Large amounts of graph data are produced in many fields, such as bioinformatics, cheminformatics, and social networks. Scalable graph data mining methods are becoming increasingly popular and necessary as graphs grow more complex. Frequent subgraph mining (FSM) is one such area, where the task is to find frequently recurring patterns (subgraphs). Many main-memory-based methods were proposed to tackle this problem, but they proved inefficient as data sizes grew exponentially over time. In the past few years, several research groups have attempted to handle the FSM problem in multiple ways. Many authors have tried to achieve better performance using Graphics Processing Units (GPUs), which offer a multi-fold improvement over in-memory methods when dealing with large datasets. Later, Google's MapReduce model with the Hadoop framework proved to be a major breakthrough in high-performance large-batch processing. Although MapReduce came with many benefits, its disk I/O and non-iterative programming model offered little help in the FSM domain, since subgraph mining is an iterative process. In recent years, Spark has emerged as the de facto industry standard thanks to its distributed in-memory computing capability, which also makes it a good fit for iterative programming. In this survey, we cover how high-performance computing has greatly improved the performance of FSM on transactional directed and undirected graphs, and we compare the performance of various FSM techniques based on experimental results.
Pages: 159-180
Page count: 22