High Performance Frequent Subgraph Mining on Transaction Datasets: A Survey and Performance Comparison

Cited by: 1
Authors
Bismita S. Jena
Cynthia Khan
Rajshekhar Sunderraman
Affiliations
[1] Department of Computer Science
[2] Georgia State University
Keywords
frequent subgraphs; isomorphism; Spark
DOI
Not available
Chinese Library Classification number
TP391.41
Discipline classification code
080203
Abstract
Graph data mining has been a crucial area of research. Large amounts of graph data are produced in many areas, such as bioinformatics, cheminformatics, and social networks. Scalable graph data mining methods are becoming increasingly popular and necessary as graph complexity increases. Frequent subgraph mining (FSM) is one such area, where the task is to find frequently recurring patterns (subgraphs). To tackle this problem, many main-memory-based methods were proposed, but they proved inefficient as data sizes grew exponentially over time. In the past few years, several research groups have attempted to handle the FSM problem in multiple ways. Many authors have tried to achieve better performance using Graphics Processing Units (GPUs), which offer a multi-fold improvement over in-memory methods when dealing with large datasets. Later, Google's MapReduce model with the Hadoop framework proved to be a major breakthrough in high-performance large-batch processing. Although MapReduce came with many benefits, its disk I/O and non-iterative model could not help much in the FSM domain, since subgraph mining is an iterative process. In recent years, Spark has emerged as the de facto industry standard with its distributed in-memory computing capability, and it is also a good fit for iterative programming. In this survey, we cover how high-performance computing has helped to improve performance tremendously for FSM on transactional datasets of directed and undirected graphs, and we compare the performance of various FSM techniques based on experimental results.
Pages: 159-180
Page count: 22