Scalable community detection in massive social networks using MapReduce

Cited by: 13
Authors
Shi, J. [1 ]
Xue, W. [2 ]
Wang, W. [3 ]
Zhang, Y.
Yang, B. [4 ]
Li, J. [5 ]
Affiliations
[1] IBM Res China, Beijing 100193, Peoples R China
[2] Tencent Inc, Beijing 100080, Peoples R China
[3] Shanghai Synacast Media Tech PPLive Inc, Shanghai 201203, Peoples R China
[4] IBM Software Grp, China Dev Lab, Beijing 100193, Peoples R China
[5] IBM Res Austin, Austin, TX 78758 USA
Keywords
MODULARITY;
DOI
10.1147/JRD.2013.2251982
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
In this paper, we present a community-detection solution for massive-scale social networks using MapReduce, a parallel programming framework. We use a similarity metric to model the community probability, and the model is designed to be parallelizable and scalable in the MapReduce framework. More importantly, we propose a set of degree-based preprocessing and postprocessing techniques named DEPOLD (DElayed Processing of Large Degree nodes) that significantly improve both the community-detection accuracy and performance. With DEPOLD, delaying analysis of 1% of high-degree nodes to the postprocessing stage reduces both processing time and storage space by one order of magnitude. DEPOLD can be applied to other graph-clustering problems. Furthermore, we design and implement two similarity calculation algorithms using MapReduce with different computation and communication characteristics in order to adapt to various system configurations. Finally, we conduct experiments with publicly available datasets. Our evaluation demonstrates the effectiveness, efficiency, and scalability of the proposed solution.
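The abstract's two main ideas, holding back the highest-degree nodes and scoring similarity in map and reduce steps, can be illustrated with a minimal single-process sketch. It is not the paper's implementation: the abstract does not name the similarity metric, so Jaccard similarity of closed neighborhoods is assumed here, and every function name (`split_by_degree`, `map_phase`, `reduce_phase`, `edge_similarity`) is hypothetical. Plain Python generators and dictionaries stand in for a real MapReduce runtime.

```python
from collections import defaultdict


def split_by_degree(edges, fraction=0.01):
    """Hold back the top `fraction` of nodes by degree (at least one node).

    Returns the edges that touch no held-back node, plus the held-back set,
    which a postprocessing stage would handle later.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Sort by descending degree, breaking ties by node name for determinism.
    ranked = sorted(degree, key=lambda n: (-degree[n], n))
    n_delayed = max(1, int(len(ranked) * fraction))
    delayed = set(ranked[:n_delayed])
    kept = [(u, v) for u, v in edges if u not in delayed and v not in delayed]
    return kept, delayed


def map_phase(edges):
    """Map step: emit (node, neighbor) in both directions for each edge."""
    for u, v in edges:
        yield u, v
        yield v, u


def reduce_phase(pairs):
    """Reduce step: group emitted neighbors by node into adjacency sets."""
    adjacency = defaultdict(set)
    for node, neighbor in pairs:
        adjacency[node].add(neighbor)
    return adjacency


def edge_similarity(edges, adjacency):
    """Assumed metric: Jaccard similarity of the closed neighborhoods."""
    sims = {}
    for u, v in edges:
        a = adjacency[u] | {u}
        b = adjacency[v] | {v}
        sims[(u, v)] = len(a & b) / len(a | b)
    return sims


# Toy graph: two triangles joined only through a hub node "h".
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f")]
edges += [("h", n) for n in "abcdef"]

kept, delayed = split_by_degree(edges)
adjacency = reduce_phase(map_phase(kept))
sims = edge_similarity(kept, adjacency)
```

On this toy graph the hub is the only node held back, which leaves the two triangles as disconnected components whose edges all score as maximally similar. The reductions in time and storage that the abstract reports concern the paper's own system and are not reproduced by this sketch.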
Pages: 14
Related papers
  • [1] Scalable Community Detection from Networks by Computing Edge Betweenness on MapReduce
    Moon, Seunghyeon
    Lee, Jae-Gil
    Kang, Minseo
    2014 INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP), 2014, : 145 - 148
  • [2] Distributed Clique Percolation based Community Detection on Social Networks using MapReduce
    Varamesh, Ali
    Akbari, Mohammad Kazem
    Fereiduni, Mehdi
    Sharifian, Saeed
    Bagheri, Alireza
    2013 5TH CONFERENCE ON INFORMATION AND KNOWLEDGE TECHNOLOGY (IKT), 2013, : 478 - 483
  • [3] Scalable High-Performance Community Detection Using Label Propagation in Massive Networks
    Boddu, Sharon
    Khan, Maleq
    SOCIAL NETWORKS ANALYSIS AND MINING, ASONAM 2024, PT I, 2025, 15211 : 3 - 19
  • [4] An Improved Community Detection Method in Massive Social Networks
    Yao, Yong
    Li, Bian
    Peng, Lei
    Liu, Zhijing
    PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON AUTOMATION, MECHANICAL CONTROL AND COMPUTATIONAL ENGINEERING, 2015, 124 : 686 - 691
  • [5] A new scalable leader-community detection approach for community detection in social networks
    Ahajjam, Sara
    El Haddad, Mohamed
    Badir, Hassan
    SOCIAL NETWORKS, 2018, 54 : 41 - 49
  • [6] Scalable Multi-threaded Community Detection in Social Networks
    Riedy, Jason
    Bader, David A.
    Meyerhenke, Henning
    2012 IEEE 26TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS & PHD FORUM (IPDPSW), 2012, : 1619 - 1628
  • [7] A scalable geometric algorithm for community detection from social networks with incremental update
    Surendran S.
    Chithraprasad D.
    Kaimal M.R.
    Social Network Analysis and Mining, 2016, 6 (1)
  • [8] Scalable Influence Maximization in Social Networks using the Community Discovery Algorithm
    Li, Jinshuang
    Yu, Yangyang
    2012 SIXTH INTERNATIONAL CONFERENCE ON GENETIC AND EVOLUTIONARY COMPUTING (ICGEC), 2012, : 284 - 287
  • [9] Community detection using multitopology and attributes in social networks
    Liu, Changzheng
    Huang, Fengling
    Li, Ruixuan
    Yang, Qi
    Li, Yuhua
    Yu, Shui
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2022, 34 (12):
  • [10] Community Detection in Social Networks Using Information Diffusion
    Hajibagheri, Alireza
    Alvari, Hamidreza
    Hamzeh, Ali
    Hashemi, Sattar
    2012 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM), 2012, : 702 - 703