Scalable community detection in massive social networks using MapReduce

Cited by: 13
Authors
Shi, J. [1]
Xue, W. [2]
Wang, W. [3]
Zhang, Y.
Yang, B. [4]
Li, J. [5]
Affiliations
[1] IBM Res China, Beijing 100193, Peoples R China
[2] Tencent Inc, Beijing 100080, Peoples R China
[3] Shanghai Synacast Media Tech PPLive Inc, Shanghai 201203, Peoples R China
[4] IBM Software Grp, China Dev Lab, Beijing 100193, Peoples R China
[5] IBM Res Austin, Austin, TX 78758 USA
Keywords
MODULARITY;
DOI
10.1147/JRD.2013.2251982
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
In this paper, we present a community-detection solution for massive-scale social networks using MapReduce, a parallel programming framework. We use a similarity metric to model the community probability, and the model is designed to be parallelizable and scalable in the MapReduce framework. More importantly, we propose a set of degree-based preprocessing and postprocessing techniques named DEPOLD (DElayed Processing of Large Degree nodes) that significantly improve both the community-detection accuracy and performance. With DEPOLD, delaying analysis of 1% of high-degree nodes to the postprocessing stage reduces both processing time and storage space by one order of magnitude. DEPOLD can be applied to other graph-clustering problems. Furthermore, we design and implement two similarity calculation algorithms using MapReduce with different computation and communication characteristics in order to adapt to various system configurations. Finally, we conduct experiments with publicly available datasets. Our evaluation demonstrates the effectiveness, efficiency, and scalability of the proposed solution.
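The delayed-processing idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the similarity metric, so Jaccard similarity over closed neighborhoods is assumed here, and the function names (`split_by_degree`, `map_phase`, `reduce_phase`, `jaccard_similarities`) are hypothetical. The map and reduce steps are plain in-memory Python functions that only mimic the MapReduce pattern, and the postprocessing stage that would reattach the delayed nodes is omitted.

```python
from collections import defaultdict


def split_by_degree(edges, fraction=0.01):
    """Return (kept, delayed): the top `fraction` of nodes by degree are delayed."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Highest degree first; ties broken by node label for determinism.
    ranked = sorted(degree, key=lambda n: (-degree[n], n))
    # Assumption: always delay at least one node so the sketch works on tiny graphs.
    n_delayed = max(1, int(len(ranked) * fraction))
    return set(ranked[n_delayed:]), set(ranked[:n_delayed])


def map_phase(edges, delayed):
    """Map step: emit (node, neighbor) pairs, skipping edges that touch delayed nodes."""
    for u, v in edges:
        if u in delayed or v in delayed:
            continue
        yield u, v
        yield v, u


def reduce_phase(pairs):
    """Reduce step: group emitted neighbors by node into adjacency sets."""
    adjacency = defaultdict(set)
    for node, neighbor in pairs:
        adjacency[node].add(neighbor)
    return adjacency


def jaccard_similarities(adjacency):
    """Jaccard similarity of closed neighborhoods for every remaining edge."""
    sims = {}
    for u, neighbors in adjacency.items():
        for v in neighbors:
            if u < v:  # score each undirected edge once
                a = adjacency[u] | {u}
                b = adjacency[v] | {v}
                sims[(u, v)] = len(a & b) / len(a | b)
    return sims
```

On a small graph with one hub connected to every other node, `split_by_degree` sets the hub aside, and the similarity pass then runs only over the edges among the remaining nodes, which is where the reduction in intermediate data would come from under these assumptions.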
Pages: 14
Related papers
50 in total
  • [31] Local Community Detection Using Social Relations and Topic Features in Social Networks
    Xu, Chengcheng
    Zhang, Huaping
    Lu, Bingbing
    Wu, Songze
    CHINESE COMPUTATIONAL LINGUISTICS AND NATURAL LANGUAGE PROCESSING BASED ON NATURALLY ANNOTATED BIG DATA, CCL 2017, 2017, 10565 : 371 - 383
  • [32] Scalable remote homology detection and fold recognition in massive protein networks
    Petegrosso, Raphael
    Li, Zhuliu
    Srour, Molly A.
    Saad, Yousef
    Zhang, Wei
    Kuang, Rui
    PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2019, 87 (06) : 478 - 491
  • [34] Multi-resolution community detection in massive networks
    Han, Jihui
    Li, Wei
    Deng, Weibing
    SCIENTIFIC REPORTS, 2016, 6
  • [35] Engineering Parallel Algorithms for Community Detection in Massive Networks
    Staudt, Christian L.
    Meyerhenke, Henning
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2016, 27 (01) : 171 - 184
  • [36] Scalable and Timely Detection of Cyberbullying in Online Social Networks
    Ibn Rafiq, Rahat
    Hosseinmardi, Homa
    Han, Richard
    Lv, Qin
    Mishra, Shivakant
    33RD ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, 2018, : 1738 - 1747
  • [37] Stratified-Sampling over Social Networks Using MapReduce
    Levin, Roy
    Kanza, Yaron
    SIGMOD'14: PROCEEDINGS OF THE 2014 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2014, : 863 - 874
  • [38] Probabilistic Community Detection in Social Networks
    Souravlas, Stavros
    Anastasiadou, Sofia D.
    Economides, Theodore
    Katsavounis, Stefanos
    IEEE ACCESS, 2023, 11 : 25629 - 25641
  • [39] Community detection in blockchain social networks
    Wu, Sissi Xiaoxiao
    Wu, Zixian
    Chen, Shihui
    Li, Gangqiang
    Zhang, Shengli
    Journal of Communications and Information Networks, 2021, 6 (01) : 59 - 71
  • [40] Emotional community detection in social networks
    Kanavos, Andreas
    Perikos, Isidoros
    Hatzilygeroudis, Ioannis
    Tsakalidis, Athanasios
    COMPUTERS & ELECTRICAL ENGINEERING, 2018, 65 : 449 - 460