Scalable subgraph enumeration in MapReduce: a cost-oriented approach

Cited by: 16
Authors
Lai, Longbin [1]
Qin, Lu [2]
Lin, Xuemin [1]
Chang, Lijun [1]
Affiliations
[1] Univ New South Wales, Sydney, NSW, Australia
[2] Univ Technol, Ctr QCIS, Sydney, NSW, Australia
Source
VLDB JOURNAL | 2017 / Vol. 26 / Issue 03
Funding
Australian Research Council; National Natural Science Foundation of China;
Keywords
MapReduce; Subgraph enumeration; Random graph; Power-law graph; ISOMORPHISM;
DOI
10.1007/s00778-017-0459-4
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Subgraph enumeration, which aims to find all subgraphs of a large data graph that are isomorphic to a given pattern graph, is a fundamental graph problem with a wide range of applications. However, existing sequential algorithms for subgraph enumeration fall short in handling large graphs because they rely on computationally intensive subgraph isomorphism operations. Recent research has therefore focused on solving the problem using MapReduce. Nevertheless, existing MapReduce approaches do not scale to very large graphs, since they either produce a huge number of partial results or consume a large amount of memory. Motivated by this, we propose a new algorithm based on a left-deep-join framework in MapReduce, in which the basic join unit is either an edge or two incident edges of a node. We show that, in the Erdős-Rényi random graph model, the algorithm is instance optimal in the left-deep-join framework under reasonable assumptions, and we devise an algorithm to compute the optimal join plan. We further discuss how our approach can be adapted to handle the power-law random graph model. Three optimization strategies are explored to improve our algorithm. Finally, by aggregating equivalent nodes into a compressed node, we construct a compressed graph, upon which subgraph enumeration is further improved. We conduct extensive performance studies on several real graphs, one of which contains billions of edges. Our approach significantly outperforms existing solutions in all tests.
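To make the idea of a join unit concrete, the sketch below enumerates triangles on a single machine by joining one unit made of two incident edges with one unit made of a single edge. It is only an illustration under stated assumptions, not the MapReduce implementation described in the paper: the function name `enumerate_triangles` is hypothetical, and the rule of keeping only higher-numbered neighbours (so that each triangle is reported once) is a choice made for this example.

```python
from collections import defaultdict
from itertools import combinations


def enumerate_triangles(edges):
    """Illustrative only: find triangles by joining two kinds of join unit."""
    # Build an undirected simple graph, ignoring self-loops.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    # Unit A: two incident edges sharing node v. Only neighbours larger than
    # v are kept (an assumption of this sketch) so each triangle appears once.
    two_edge_units = defaultdict(list)
    for v, nbrs in adj.items():
        higher = sorted(n for n in nbrs if n > v)
        for u, w in combinations(higher, 2):
            two_edge_units[(u, w)].append(v)

    # Unit B: a single edge, keyed by its endpoints in sorted order.
    edge_units = {(min(u, v), max(u, v)) for u in adj for v in adj[u]}

    # Join the two units on the shared key (u, w): a match closes a triangle.
    return sorted(
        (v, u, w)
        for (u, w), centres in two_edge_units.items()
        if (u, w) in edge_units
        for v in centres
    )


if __name__ == "__main__":
    print(enumerate_triangles([(1, 2), (2, 3), (1, 3), (3, 4)]))
```

In a distributed setting the join key `(u, w)` would play the role of the shuffle key, and patterns larger than a triangle would need a sequence of such joins; how to order them is the join-plan question the abstract refers to.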
Pages: 421 - 446
Number of pages: 26
Related papers
50 in total
  • [41] Dual sourcing strategy in cost-oriented and flexibility-oriented suppliers environment
    Pai, Fan-Yun
    [J]. AFRICAN JOURNAL OF BUSINESS MANAGEMENT, 2010, 4 (18): : 4029 - 4034
  • [42] Design and Construction of a Cost-Oriented Mobile Robot for Domestic Assistance
    Pallares O, Brayan S.
    Rozo M, Tatiana A.
    Camacho, Edgar C.
    Guillermo Guarnizo, Jose
    Calderon, Juan M.
    [J]. IFAC PAPERSONLINE, 2021, 54 (13): : 293 - 298
  • [43] A note on "An exact method for cost-oriented assembly line balancing"
    Scholl, A
    Becker, C
    [J]. INTERNATIONAL JOURNAL OF PRODUCTION ECONOMICS, 2005, 97 (03) : 343 - 352
  • [44] A Model of Cost-oriented Price-making for Logistics Service
    Zhou Xingjian
    Du Chengxiang
    Zhu Jieyin
    Yang Weifeng
    [J]. ICPOM2008: PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE OF PRODUCTION AND OPERATION MANAGEMENT, VOLUMES 1-3, 2008, : 254 - 258
  • [45] Reliability- and cost-oriented optimal bridge maintenance planning
    Frangopol, Dan M.
    Miyake, Masaru
    Kong, Jung S.
    Gharaibeh, Emhaidy S.
    Estes, Allen C.
    [J]. Recent Advances in Optimal Structural Design, 2002, : 257 - 270
  • [46] Design and Research for a Cost-Oriented Hybrid Electric Vehicle Architecture
    Zhang, Jianlong
    Wang, Lei
    Yin, Chengliang
    [J]. PROCEEDINGS OF THE ASME INTERNATIONAL DESIGN ENGINEERING TECHNICAL CONFERENCES AND COMPUTERS AND INFORMATION IN ENGINEERING CONFERENCE, DETC 2010, VOL 4, 2010, : 229 - 233
  • [47] A Cost-Oriented Optimal Model of Electric Vehicle Taxi Systems
    Liu, Xiang
    Wang, Ning
    Dong, Decun
    [J]. SUSTAINABILITY, 2018, 10 (05)
  • [48] Cost-Oriented Candidate Screening Using Machine Learning Algorithms
    Wild, Shachar
    Last, Mark
    [J]. RECENT CHALLENGES IN INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2022, 2022, 1716 : 737 - 750
  • [49] A Scalable Parallel Approach for Subgraph Census Computation
    Aparicio, David
    Paredes, Pedro
    Ribeiro, Pedro
    [J]. EURO-PAR 2014: PARALLEL PROCESSING WORKSHOPS, PT II, 2014, 8806 : 194 - 205
  • [50] Machine Learning for failure prediction: A cost-oriented model selection
    Tortora, Alessia Maria Rosaria
    Veneroso, Ciele Resende
    Di Pasquale, Valentina
    Riemma, Stefano
    Iannone, Raffaele
    [J]. 5TH INTERNATIONAL CONFERENCE ON INDUSTRY 4.0 AND SMART MANUFACTURING, ISM 2023, 2024, 232 : 3195 - 3205