A distributed approach for graph mining in massive networks

Cited by: 56
Authors
Talukder, N. [1]
Zaki, M. J. [1]
Affiliations
[1] Rensselaer Polytech Inst, Troy, NY 12180 USA
Funding
National Science Foundation (USA)
Keywords
Parallel graph mining; Distributed graph mining; Single large graph; Frequent subgraph mining; High performance computing; ALGORITHM;
DOI
10.1007/s10618-016-0466-x
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a novel distributed algorithm for mining frequent subgraphs from a single, very large, labeled network. Our approach is the first distributed method to mine a massive input graph that is too large to fit in the memory of any individual compute node. The input graph thus has to be partitioned among the nodes, which can lead to potential false negatives. Furthermore, for scalable performance it is crucial to minimize the communication among the compute nodes. Our algorithm, DistGraph, ensures that there are no false negatives, and uses a set of optimizations and efficient collective communication operations to minimize information exchange. To our knowledge DistGraph is the first approach demonstrated to scale to graphs with over a billion vertices and edges. Scalability results on up to 2048 IBM Blue Gene/Q compute nodes, with 16 cores each, show very good speedup.
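The false-negative risk mentioned in the abstract can be illustrated with a toy sketch. This is not the DistGraph algorithm: it assumes the simplest possible pattern (a single edge identified by its unordered pair of vertex labels) and a plain occurrence count as the support measure. The names `local_counts`, `labels`, `partitions`, and `min_support` are hypothetical. Each partition alone sees too few occurrences, while summing the per-partition counts (the role a collective reduction would play across compute nodes) recovers the frequent pattern.

```python
from collections import Counter

def local_counts(edges, labels):
    """Count single-edge patterns (unordered label pairs) in one partition."""
    counts = Counter()
    for u, v in edges:
        counts[tuple(sorted((labels[u], labels[v])))] += 1
    return counts

# Hypothetical toy graph: vertex id -> label.
labels = {0: "A", 1: "B", 2: "A", 3: "B", 4: "A", 5: "B", 6: "C"}

# Hypothetical edge partitions, one list per compute node.
partitions = [
    [(0, 1), (2, 3)],  # node 0
    [(4, 5), (5, 6)],  # node 1
]
min_support = 3

per_node = [local_counts(p, labels) for p in partitions]

# Naive approach: each node applies the threshold to its local counts only.
locally_frequent = set()
for counts in per_node:
    locally_frequent |= {p for p, n in counts.items() if n >= min_support}

# Summing the counts across nodes before thresholding avoids the miss.
global_counts = sum(per_node, Counter())
globally_frequent = {p for p, n in global_counts.items() if n >= min_support}

print("locally frequent:", locally_frequent)
print("globally frequent:", globally_frequent)
```

In this toy the pattern with labels A and B occurs twice on one node and once on the other, so local thresholding misses it. Real single-graph support measures and multi-edge patterns that span partition boundaries are considerably harder to handle than this count-and-sum example suggests.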
Pages: 1024-1052
Page count: 29