Placing big graph into cloud for parallel processing with a two-phase community-aware approach

Cited by: 3
Authors
Hu, Kekun [1 ,2 ]
Zeng, Guosun [1 ,2 ]
Affiliations
[1] Tongji Univ, Dept Comp Sci & Technol, Shanghai 201804, Peoples R China
[2] Natl Engn & Technol Ctr High Performance Comp, Tongji Branch, Shanghai 201804, Peoples R China
Keywords
Cloud computing; Big graph processing; Data placement; Community detection; Scale constraints; Modularity density;
DOI
10.1016/j.future.2019.07.014
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
Big graphs are so large that their analysis often relies on the cloud for parallel processing. Data placement, a key pre-processing step, has a profound impact on the performance of that processing. Traditional placement methods fail to preserve graph topology, which leads to poor performance. Because the community is the most common structure in big graphs, we present a two-phase community-aware placement algorithm for placing big graphs into the cloud for parallel processing. The algorithm obtains a placement scheme that preserves community structure well by maximizing the modularity density of the scheme, subject to the memory capacity constraints of the cloud's computational nodes. In the first phase, we design a streaming partitioning heuristic that detects communities from partial and incomplete graph information. The detected communities form an initial placement scheme with relatively high modularity density. In the second phase, we improve this scheme with a scale-constrained kernel k-means algorithm. It takes the initial placement scheme as input and iteratively redistributes graph vertices across computational nodes under scale constraints until the modularity density can no longer be improved. Experiments show that our algorithm preserves graph topology well and greatly supports parallel processing of big graphs in the cloud. (C) 2019 Elsevier B.V. All rights reserved.
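The abstract does not give the algorithm's details, so the following is only a minimal illustrative sketch, not the authors' method. It assumes the commonly used definition of modularity density (for each part, twice the internal edge count minus the external edge count, divided by the part size, summed over parts) and substitutes a simple greedy one-pass streaming rule for the first phase. The function names `modularity_density` and `stream_place`, and the `capacity` parameter (a maximum vertex count per node standing in for a memory limit), are hypothetical. The second-phase kernel k-means refinement is not sketched.

```python
from collections import defaultdict


def modularity_density(edges, placement):
    """Assumed definition: sum over parts of (2*internal - external) / size."""
    size = defaultdict(int)
    for part in placement.values():
        size[part] += 1
    internal = defaultdict(int)
    external = defaultdict(int)
    for u, v in edges:
        pu, pv = placement[u], placement[v]
        if pu == pv:
            internal[pu] += 1
        else:
            # A cut edge counts as external for both parts it touches.
            external[pu] += 1
            external[pv] += 1
    return sum((2 * internal[p] - external[p]) / size[p] for p in size)


def stream_place(vertex_stream, num_nodes, capacity):
    """Greedy one-pass placement (illustrative stand-in for a streaming phase).

    Each arriving vertex goes to the non-full node that already holds most of
    its neighbours; ties go to the lighter node, then to the lower index.
    """
    placement = {}
    load = [0] * num_nodes
    for v, neighbours in vertex_stream:
        counts = [0] * num_nodes
        for n in neighbours:
            if n in placement:  # only already-seen vertices are known
                counts[placement[n]] += 1
        candidates = [p for p in range(num_nodes) if load[p] < capacity]
        if not candidates:
            raise ValueError("total capacity exceeded")
        best = max(candidates, key=lambda p: (counts[p], -load[p], -p))
        placement[v] = best
        load[best] += 1
    return placement


# Two triangles joined by one bridge edge, streamed in vertex order.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
stream = [(0, [1, 2]), (1, [0, 2]), (2, [0, 1, 3]),
          (3, [2, 4, 5]), (4, [3, 5]), (5, [3, 4])]
placement = stream_place(stream, num_nodes=2, capacity=3)
print(placement, modularity_density(edges, placement))
```

On this toy graph the greedy rule keeps each triangle on its own node, and the resulting modularity density is higher than that of a placement that alternates vertices between the two nodes.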
Pages: 1187-1200
Number of pages: 14