Set-based approximate approach for lossless graph summarization

Cited by: 21
Authors
Khan, Kifayat Ullah [1]
Nawaz, Waqas [1]
Lee, Young-Koo [1]
Affiliation
[1] Kyung Hee Univ, Dept Comp Engn, Yongin 446701, Gyeonggi Do, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Graph summarization; LSH; MDL; Degree similarity; Auto pruning; SIMILARITY SEARCH;
DOI
10.1007/s00607-015-0454-9
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Graph summarization is a valuable approach for analyzing various real-life phenomena, such as communities, influential nodes, and information flow, in a big graph. To summarize a graph, nodes having similar neighbors are merged into super nodes and their corresponding edges are compressed into super edges. Existing methods find similar nodes either by node ordering or by performing pairwise similarity computations. Compression-by-node-ordering approaches are scalable but provide less compression than their pairwise counterparts, which benefit from exhaustive similarity computations. In this paper, we propose a novel set-based summarization approach that directly summarizes naturally occurring sets of similar nodes in a graph. Our approach is scalable since we avoid explicit similarity computations with non-similar nodes and merge sets of nodes in each iteration. Likewise, we provide a good compression ratio, as each set consists of highly similar nodes. To locate sets of similar nodes, we find candidate sets by using locality sensitive hashing. However, the member nodes of every candidate set have varying similarities with each other. Therefore, we propose a heuristic based on the similarity among the degrees of candidate nodes, and a parameter-free pruning technique, to effectively identify the subset of highly similar nodes among the candidates. In experiments on real-world graphs, our approach requires less execution time than pairwise graph summarization, by a margin of an order of magnitude in graphs containing nodes with highly diverse neighborhoods, and produces summaries of similar accuracy. We also observe scalability comparable to the compression-by-node-ordering method, while providing a better compression ratio.
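As a rough illustration of the candidate-generation step mentioned in the abstract, the sketch below buckets nodes by a MinHash signature of their neighbor sets, which is one common form of locality sensitive hashing for Jaccard similarity. It is not the authors' implementation: the function names (`minhash_signature`, `candidate_sets`, `jaccard`), the use of `blake2b` as the hash family, and the `num_hashes` parameter are all assumptions made for this example, and the degree-based heuristic and pruning step are not shown.

```python
import hashlib
from collections import defaultdict


def minhash_signature(neighbors, num_hashes=4):
    """Return a MinHash signature for a non-empty neighbor set.

    For each seed we hash every neighbor and keep the minimum value.
    Two sets agree on one component with probability equal to their
    Jaccard similarity (assuming the hashes behave like random permutations).
    """
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{v}".encode(), digest_size=8).digest(),
                "big",
            )
            for v in neighbors
        ))
    return tuple(signature)


def candidate_sets(adj, num_hashes=4):
    """Group nodes whose full signatures collide (hypothetical helper).

    adj maps each node to the set of its neighbors. Only buckets with
    more than one node are returned as candidate sets for merging.
    """
    buckets = defaultdict(set)
    for node, nbrs in adj.items():
        if nbrs:  # isolated nodes have no neighborhood to compare
            buckets[minhash_signature(nbrs, num_hashes)].add(node)
    return [nodes for nodes in buckets.values() if len(nodes) > 1]


def jaccard(a, b):
    """Exact Jaccard similarity, useful for checking a candidate set."""
    return len(a & b) / len(a | b)


# Small undirected example: a and b share exactly the same neighbors.
adj = {
    "a": {"x", "y", "z"},
    "b": {"x", "y", "z"},
    "c": {"p", "q"},
    "x": {"a", "b"},
    "y": {"a", "b"},
    "z": {"a", "b"},
    "p": {"c"},
    "q": {"c"},
}
sets_found = candidate_sets(adj)
```

Requiring the whole signature to match makes the buckets strict; a banded scheme that hashes only part of the signature per band would admit less similar nodes into a candidate set, which is where a pruning step such as the one described in the abstract would become necessary.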
Pages: 1185-1207
Number of pages: 23
Related papers
50 in total
  • [1] Set-based approximate approach for lossless graph summarization
    Kifayat Ullah Khan
    Waqas Nawaz
    Young-Koo Lee
    [J]. Computing, 2015, 97 : 1185 - 1207
  • [2] Set-based Approach for Lossless Graph Summarization using Locality Sensitive Hashing
    Khan, Kifayat Ullah
    Lee, Young-Koo
    [J]. 2015 13TH IEEE INTERNATIONAL CONFERENCE ON DATA ENGINEERING WORKSHOPS (ICDEW), 2015, : 255 - 259
  • [3] Set-based Unified Approach for Attributed Graph Summarization
    Khan, Kifayat Ullah
    Nawaz, Waqas
    Lee, Young-Koo
    [J]. 2014 IEEE FOURTH INTERNATIONAL CONFERENCE ON BIG DATA AND CLOUD COMPUTING (BDCLOUD), 2014, : 378 - 385
  • [4] Set-based unified approach for summarization of a multi-attributed graph
    Kifayat Ullah Khan
    Waqas Nawaz
    Young-Koo Lee
    [J]. World Wide Web, 2017, 20 : 543 - 570
  • [5] Set-based unified approach for summarization of a multi-attributed graph
    Khan, Kifayat Ullah
    Nawaz, Waqas
    Lee, Young-Koo
    [J]. WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2017, 20 (03): : 543 - 570
  • [6] SaintEtiQ: a fuzzy set-based approach to database summarization
    Raschia, G
    Mouaddib, N
    [J]. FUZZY SETS AND SYSTEMS, 2002, 129 (02) : 137 - 162
  • [7] Incremental Lossless Graph Summarization
    Ko, Jihoon
    Kook, Yunbum
    Shin, Kijung
    [J]. KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020, : 317 - 327
  • [8] A Parameter-Free Approach for Lossless Streaming Graph Summarization
    Ma, Ziyi
    Yang, Jianye
    Li, Kenli
    Liu, Yuling
    Zhou, Xu
    Hu, Yikun
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2021), PT I, 2021, 12681 : 385 - 393
  • [9] RDF Graph Summarization Based on Approximate Patterns
    Zneika, Mussab
    Lucchese, Claudio
    Vodislav, Dan
    Kotzinos, Dimitris
    [J]. INFORMATION SEARCH, INTEGRATION, AND PERSONALIZATION, (ISIP 2015), 2016, 622 : 69 - 87
  • [10] An alternative perspective on fuzzy set-based approximate reasoning
    Lewis, HW
    [J]. INTERNATIONAL JOURNAL OF GENERAL SYSTEMS, 1997, 26 (1-2) : 97 - 114