RECEIPT: REfine CoarsE-grained IndePendent Tasks for Parallel Tip decomposition of Bipartite Graphs

Cited by: 6
Authors
Lakhotia, Kartik [1]
Kannan, Rajgopal [2]
Prasanna, Viktor [1]
De Rose, Cesar A. F. [3]
Affiliations
[1] Univ Southern Calif, Ming Hsieh Dept Elect Engn, Los Angeles, CA 90007 USA
[2] USA Res Lab, Los Angeles, CA 90094 USA
[3] Pontificia Univ Catolica Rio Grande do Sul, Sch Technol, Porto Alegre, RS, Brazil
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2020, Vol. 14, No. 3
Funding
National Science Foundation (USA)
Keywords
ALGORITHMS
DOI
10.14778/3430915.3430929
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Tip decomposition is a crucial kernel for mining dense subgraphs in bipartite networks, with applications in spam detection and the analysis of affiliation networks. It creates a hierarchy of vertex-induced subgraphs with varying densities determined by the participation of vertices in butterflies (2,2-bicliques). To build the hierarchy, existing algorithms iteratively follow a delete-update (peeling) process: deleting vertices with the minimum number of butterflies and correspondingly updating the butterfly counts of their 2-hop neighbors. The need to explore 2-hop neighborhoods renders tip decomposition computationally very expensive. Furthermore, the inherent sequentiality in peeling only minimum-butterfly vertices makes derived parallel algorithms prone to heavy synchronization. In this paper, we propose a novel parallel tip decomposition algorithm - REfine CoarsE-grained IndePendent Tasks (RECEIPT) - that relaxes the peeling order restrictions by partitioning the vertices into multiple independent subsets that can be peeled concurrently. This enables RECEIPT to simultaneously achieve a high degree of parallelism and a dramatic reduction in synchronizations. Further, RECEIPT employs a hybrid peeling strategy along with other optimizations that drastically reduce the amount of wedge exploration and the execution time. We perform a detailed experimental evaluation of RECEIPT on a shared-memory multicore server. It can process some of the largest publicly available bipartite datasets orders of magnitude faster than state-of-the-art algorithms, achieving up to 1100x and 64x reductions in the number of thread synchronizations and traversed wedges, respectively. Using 36 threads, RECEIPT can provide up to 17.1x self-relative speedup.
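For orientation, the sequential delete-update (peeling) baseline described in the abstract can be sketched as follows. This is a minimal illustration of that baseline only, not RECEIPT's partitioned parallel algorithm. The function name `tip_decomposition`, the dict-of-sets input format, and the heap with lazy deletion are all assumptions made for this example; they do not come from the paper.

```python
from collections import defaultdict
from math import comb
import heapq


def tip_decomposition(adj_u):
    """Sequential peeling baseline (illustrative; not the RECEIPT algorithm).

    adj_u: dict mapping each U-side vertex to the set of its V-side neighbors.
    U-side vertex labels must be mutually comparable (heap tie-breaking).
    Returns a dict mapping each U-side vertex to its tip number.
    """
    # Reverse adjacency: V-side vertex -> set of U-side neighbors.
    adj_v = defaultdict(set)
    for u, nbrs in adj_u.items():
        for v in nbrs:
            adj_v[v].add(u)

    def common_neighbors(u, alive):
        # Explore the 2-hop neighborhood of u: for every other live
        # U-side vertex, count the V-side neighbors it shares with u.
        shared = defaultdict(int)
        for v in adj_u[u]:
            for u2 in adj_v[v]:
                if u2 != u and u2 in alive:
                    shared[u2] += 1
        return shared

    alive = set(adj_u)
    # Two vertices sharing c neighbors form C(c, 2) butterflies together.
    butterflies = {
        u: sum(comb(c, 2) for c in common_neighbors(u, alive).values())
        for u in adj_u
    }

    heap = [(b, u) for u, b in butterflies.items()]
    heapq.heapify(heap)
    tip = {}
    level = 0  # largest butterfly count seen at a deletion so far
    while heap:
        b, u = heapq.heappop(heap)
        if u not in alive or b != butterflies[u]:
            continue  # stale heap entry left behind by an earlier update
        level = max(level, b)
        tip[u] = level
        alive.remove(u)
        # Update step: remove the butterflies u shared with each 2-hop
        # neighbor, never dropping a count below the current level.
        for u2, c in common_neighbors(u, alive).items():
            updated = max(level, butterflies[u2] - comb(c, 2))
            if updated != butterflies[u2]:
                butterflies[u2] = updated
                heapq.heappush(heap, (updated, u2))
    return tip


# Hypothetical toy graph: a, b, c share three neighbors; d shares two of them.
example = {
    "a": {"x", "y", "z"},
    "b": {"x", "y", "z"},
    "c": {"x", "y", "z"},
    "d": {"x", "y"},
    "e": {"w"},
}
print(tip_decomposition(example))
```

On this toy graph, `e` is peeled first with no butterflies, `d` at level 3, and `a`, `b`, `c` at level 6. Every deletion re-walks a 2-hop neighborhood and must happen in minimum-count order, which is the cost and the sequential dependency that the abstract says RECEIPT is designed to relax.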
Pages: 404-417
Page count: 14