A distributed approach for graph mining in massive networks

Cited by: 56
Authors
Talukder, N. [1]
Zaki, M. J. [1]
Affiliations
[1] Rensselaer Polytech Inst, Troy, NY 12180 USA
Funding
National Science Foundation (USA)
Keywords
Parallel graph mining; Distributed graph mining; Single large graph; Frequent subgraph mining; High performance computing; ALGORITHM;
DOI
10.1007/s10618-016-0466-x
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a novel distributed algorithm for mining frequent subgraphs from a single, very large, labeled network. Our approach is the first distributed method to mine a massive input graph that is too large to fit in the memory of any individual compute node. The input graph thus has to be partitioned among the nodes, which can lead to potential false negatives. Furthermore, for scalable performance it is crucial to minimize the communication among the compute nodes. Our algorithm, DistGraph, ensures that there are no false negatives, and uses a set of optimizations and efficient collective communication operations to minimize information exchange. To our knowledge DistGraph is the first approach demonstrated to scale to graphs with over a billion vertices and edges. Scalability results on up to 2048 IBM Blue Gene/Q compute nodes, with 16 cores each, show very good speedup.
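The abstract notes that partitioning the input graph can cause false negatives. The sketch below illustrates that effect on a toy graph; it is not the DistGraph algorithm. It assumes a deliberately simplified support measure (the number of edges matching a label pair), and the names `owner`, `MIN_SUPPORT`, and all function names are hypothetical.

```python
from collections import Counter

# Toy labeled graph (illustrative data, not from the paper).
labels = {0: "A", 1: "B", 2: "A", 3: "B", 4: "A", 5: "B"}
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]

# Hypothetical assignment of vertices to two compute nodes.
owner = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
MIN_SUPPORT = 5  # assumed threshold for this example


def pattern(u, v):
    """Single-edge pattern: the unordered pair of endpoint labels."""
    return tuple(sorted((labels[u], labels[v])))


def global_counts():
    """Support computed with the whole graph in one memory space."""
    return Counter(pattern(u, v) for u, v in edges)


def naive_partitioned_counts():
    """Each partition counts only edges fully inside it; cut edges are lost."""
    total = Counter()
    for part in set(owner.values()):
        local = Counter(
            pattern(u, v)
            for u, v in edges
            if owner[u] == part and owner[v] == part
        )
        total += local  # stands in for a sum-reduction across nodes
    return total


def boundary_aware_counts():
    """Each edge is counted exactly once, by the partition that owns its
    lower-numbered endpoint, so cut edges are not dropped."""
    total = Counter()
    for part in set(owner.values()):
        local = Counter(
            pattern(u, v) for u, v in edges if owner[min(u, v)] == part
        )
        total += local
    return total


def frequent(counts):
    """Patterns whose summed support meets the threshold."""
    return {p for p, c in counts.items() if c >= MIN_SUPPORT}
```

In this toy setup the edge (2, 3) crosses the partition boundary. The naive count misses it, so the A-B pattern falls below the threshold even though it is frequent in the full graph; assigning every cut edge to exactly one partition restores the correct total.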
Pages: 1024-1052
Number of pages: 29