High Performance Frequent Subgraph Mining on Transaction Datasets: A Survey and Performance Comparison

Cited by: 1

Authors:

Bismita S. Jena

Cynthia Khan

Rajshekhar Sunderraman

Affiliations:

[1] Department of Computer Science

[2] Georgia State University

Source:

Big Data Mining and Analytics | 2019 / Vol. 2 / Issue 03

Keywords:

frequent subgraphs; isomorphism; Spark

DOI:

Not available

Chinese Library Classification (CLC) number:

TP391.41

Discipline classification code:

080203

Abstract:

Graph data mining is a crucial and unavoidable area of research. Large amounts of graph data are produced in many fields, such as bioinformatics, cheminformatics, and social networks. Scalable graph data mining methods are becoming increasingly popular and necessary as graph complexity grows. Frequent subgraph mining (FSM) is one such area, where the task is to find frequently recurring patterns (subgraphs). To tackle this problem, many main-memory-based methods were proposed, but they proved inefficient as data sizes grew exponentially over time. In the past few years, several research groups have attempted to handle the FSM problem in multiple ways. Many authors have tried to achieve better performance using Graphics Processing Units (GPUs), which offer a multi-fold improvement over in-memory methods when dealing with large datasets. Later, Google's MapReduce model, together with the Hadoop framework, proved to be a major breakthrough in high-performance large-scale batch processing. Although MapReduce brought many benefits, its disk I/O and non-iterative model were of limited help in the FSM domain, since subgraph mining is an iterative process. In recent years, Spark has emerged as the de facto industry standard thanks to its distributed in-memory computing capability, which also makes it a good fit for iterative programming. In this survey, we cover how high-performance computing has helped improve performance tremendously for transactional directed and undirected graphs, and we compare the performance of various FSM techniques based on experimental results.

Citation

Pages: 159-180

Page count: 22
