Scalable community detection in massive social networks using MapReduce

Citations: 13
Authors
Shi, J. [1 ]
Xue, W. [2 ]
Wang, W. [3 ]
Zhang, Y.
Yang, B. [4 ]
Li, J. [5 ]
Affiliations
[1] IBM Res China, Beijing 100193, Peoples R China
[2] Tencent Inc, Beijing 100080, Peoples R China
[3] Shanghai Synacast Media Tech PPLive Inc, Shanghai 201203, Peoples R China
[4] IBM Software Grp, China Dev Lab, Beijing 100193, Peoples R China
[5] IBM Res Austin, Austin, TX 78758 USA
Keywords
MODULARITY;
DOI
10.1147/JRD.2013.2251982
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
In this paper, we present a community-detection solution for massive-scale social networks using MapReduce, a parallel programming framework. We use a similarity metric to model the community probability, and the model is designed to be parallelizable and scalable in the MapReduce framework. More importantly, we propose a set of degree-based preprocessing and postprocessing techniques named DEPOLD (DElayed Processing of Large Degree nodes) that significantly improve both the community-detection accuracy and performance. With DEPOLD, delaying analysis of 1% of high-degree nodes to the postprocessing stage reduces both processing time and storage space by one order of magnitude. DEPOLD can be applied to other graph-clustering problems. Furthermore, we design and implement two similarity calculation algorithms using MapReduce with different computation and communication characteristics in order to adapt to various system configurations. Finally, we conduct experiments with publicly available datasets. Our evaluation demonstrates the effectiveness, efficiency, and scalability of the proposed solution.
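The abstract's two main ideas, delaying high-degree nodes and computing a pairwise similarity in a map/shuffle/reduce style, can be sketched on a single machine as follows. This is only an illustration under stated assumptions: the abstract does not specify the similarity metric, so Jaccard similarity over closed neighborhoods is used as a stand-in, and the function names (`split_by_degree`, `edge_similarity`) and the `delay_fraction` parameter are hypothetical, not taken from the paper. The postprocessing step that reattaches delayed nodes is not shown.

```python
import math
from collections import defaultdict


def split_by_degree(edges, delay_fraction=0.01):
    """Set aside the highest-degree nodes (a DEPOLD-style preprocessing sketch).

    Returns (kept_edges, delayed_nodes). Edges touching a delayed node are
    dropped from the main pass; handling them later is left out here.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Rank by descending degree; break ties by node id for determinism.
    ranked = sorted(degree, key=lambda n: (-degree[n], n))
    n_delayed = math.ceil(len(ranked) * delay_fraction)
    delayed = set(ranked[:n_delayed])
    kept = [(u, v) for u, v in edges if u not in delayed and v not in delayed]
    return kept, delayed


def map_phase(edges):
    """Map: emit (node, neighbor) in both directions for each undirected edge."""
    for u, v in edges:
        yield u, v
        yield v, u


def shuffle(pairs):
    """Shuffle: group emitted values by key, yielding each node's neighbor set."""
    groups = defaultdict(set)
    for key, value in pairs:
        groups[key].add(value)
    return groups


def edge_similarity(edges):
    """Reduce: Jaccard similarity of closed neighborhoods for each edge.

    Assumption: Jaccard stands in for the paper's unspecified metric.
    """
    neighbors = shuffle(map_phase(edges))
    sims = {}
    for u, v in edges:
        a = neighbors[u] | {u}
        b = neighbors[v] | {v}
        sims[(u, v)] = len(a & b) / len(a | b)
    return sims


if __name__ == "__main__":
    # A hub (node 0) attached to nine nodes, plus a triangle among 1, 2, 3.
    edges = [(0, i) for i in range(1, 10)] + [(1, 2), (2, 3), (1, 3)]
    kept, delayed = split_by_degree(edges, delay_fraction=0.1)
    print("delayed:", delayed)
    print("similarities:", edge_similarity(kept))
```

Removing the hub before the similarity pass keeps each neighbor set small, which gives an intuition for why delaying a small fraction of high-degree nodes could cut both computation and intermediate storage.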
Pages: 14