Set-based approximate approach for lossless graph summarization

Cited by: 21
Authors
Khan, Kifayat Ullah [1]
Nawaz, Waqas [1]
Lee, Young-Koo [1]
Affiliations
[1] Kyung Hee Univ, Dept Comp Engn, Yongin 446701, Gyeonggi Do, South Korea
Funding
National Research Foundation of Singapore
Keywords
Graph summarization; LSH; MDL; Degree similarity; Auto pruning; Similarity search
DOI
10.1007/s00607-015-0454-9
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Graph summarization is a valuable approach for analyzing various real-life phenomena, such as communities, influential nodes, and information flow, in a big graph. To summarize a graph, nodes having similar neighbors are merged into super nodes and their corresponding edges are compressed into super edges. Existing methods find similar nodes either by node ordering or by pairwise similarity computations. Compression-by-node-ordering approaches are scalable but provide less compression than their pairwise counterparts, which rely on exhaustive similarity computations. In this paper, we propose a novel set-based summarization approach that directly summarizes naturally occurring sets of similar nodes in a graph. Our approach is scalable because we avoid explicit similarity computations with non-similar nodes and merge sets of nodes in each iteration. It also provides a good compression ratio, as each set consists of highly similar nodes. To locate sets of similar nodes, we find candidate sets by using locality-sensitive hashing (LSH). However, the member nodes of a candidate set have varying similarities with each other. Therefore, we propose a heuristic based on similarity among the degrees of candidate nodes, together with a parameter-free pruning technique, to effectively identify the subset of highly similar nodes among the candidates. In experiments on real-world graphs, our approach requires less execution time than pairwise graph summarization, by a margin of an order of magnitude on graphs containing nodes with highly diverse neighborhoods, and produces summaries of similar accuracy. We also observe scalability comparable to the compression-by-node-ordering method, while providing a better compression ratio.
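The candidate-generation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract says only that locality-sensitive hashing is used, so the MinHash-with-banding scheme, the function names (`minhash_signature`, `candidate_sets`, `jaccard`), and all parameter values below are assumptions chosen for illustration. Node identifiers are assumed to be integers.

```python
import random
from collections import defaultdict

PRIME = 2_147_483_647  # large prime modulus for the illustrative hash family


def jaccard(a, b):
    """Jaccard similarity of two neighbor sets."""
    return len(a & b) / len(a | b)


def minhash_signature(neighbors, hash_params):
    """One min-hash value per (a, b) pair over an integer neighbor set."""
    return tuple(
        min((a * n + b) % PRIME for n in neighbors)
        for a, b in hash_params
    )


def candidate_sets(adjacency, num_hashes=4, bands=2, seed=0):
    """Group nodes whose signatures agree on at least one band.

    Hypothetical helper: nodes sharing a bucket form a candidate set
    and are likely, but not guaranteed, to have similar neighbors.
    """
    rng = random.Random(seed)
    hash_params = [
        (rng.randrange(1, PRIME), rng.randrange(0, PRIME))
        for _ in range(num_hashes)
    ]
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for node, neighbors in adjacency.items():
        if not neighbors:
            continue  # isolated nodes have no neighborhood to hash
        sig = minhash_signature(neighbors, hash_params)
        for band in range(bands):
            key = (band, sig[band * rows:(band + 1) * rows])
            buckets[key].add(node)
    # Only buckets holding two or more nodes are useful candidates.
    return {frozenset(b) for b in buckets.values() if len(b) > 1}


if __name__ == "__main__":
    graph = {0: {2, 3, 4}, 1: {2, 3, 4}, 5: {6, 7}}
    for group in candidate_sets(graph):
        print(sorted(group))
```

Nodes with identical neighbor sets always share a bucket, while nodes with only partly overlapping neighborhoods may or may not collide. That is why members of a candidate set can have varying similarity (measurable here with `jaccard`), and why a further filtering step, such as the degree-based heuristic and pruning described in the abstract, would be needed.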
Pages: 1185-1207
Page count: 23