Incremental Lossless Graph Summarization

Cited by: 27
Authors
Ko, Jihoon [1]
Kook, Yunbum [2]
Shin, Kijung [3]
Affiliations
[1] Korea Adv Inst Sci & Technol, AI, Daejeon, South Korea
[2] Korea Adv Inst Sci & Technol, Dept Math Sci, Daejeon, South Korea
[3] Korea Adv Inst Sci & Technol, AI & EE, Daejeon, South Korea
Funding
National Research Foundation, Singapore
Keywords
DOI
10.1145/3394486.3403074
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Given a fully dynamic graph, represented as a stream of edge insertions and deletions, how can we obtain and incrementally update a lossless summary of its current snapshot? As large-scale graphs are prevalent, representing them concisely is essential for efficient storage and analysis. Lossless graph summarization is an effective graph-compression technique with many desirable properties. It aims to compactly represent the input graph as (a) a summary graph consisting of supernodes (i.e., sets of nodes) and superedges (i.e., edges between supernodes), which provide a rough description, and (b) edge corrections, which fix the errors induced by the rough description. While a number of batch algorithms, suited to static graphs, have been developed for rapid and compact graph summarization, they are highly inefficient in terms of time and space for dynamic graphs, which are common in practice. In this work, we propose MoSSo, the first incremental algorithm for lossless summarization of fully dynamic graphs. In response to each change in the input graph, MoSSo updates the output representation by repeatedly moving nodes among supernodes. MoSSo selects the nodes to move and their destinations carefully but rapidly, based on several novel ideas. Through extensive experiments on 10 real graphs, we show that MoSSo is (a) Fast and 'any time': processing each change in near-constant time (less than 0.1 millisecond), up to 7 orders of magnitude faster than running state-of-the-art batch methods, (b) Scalable: summarizing graphs with hundreds of millions of edges, requiring sub-linear memory during the process, and (c) Effective: achieving compression ratios comparable even to those of state-of-the-art batch methods.
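The representation described in the abstract (a summary graph plus edge corrections) can be illustrated with a minimal decoding sketch. This is not MoSSo itself and does not reproduce the authors' implementation. It only shows, under assumptions, how the original edge set could be recovered exactly from such a summary. The names `decode`, `supernodes`, `superedges`, `c_plus`, and `c_minus` are hypothetical, and the graph is assumed to be undirected with no self-loops.

```python
from itertools import product


def edge(u, v):
    """Canonical form of an undirected edge (assumption: node labels are orderable)."""
    return (u, v) if u <= v else (v, u)


def decode(supernodes, superedges, c_plus, c_minus):
    """Rebuild the exact edge set from a summary graph and edge corrections.

    supernodes: dict mapping a supernode id to the set of nodes it contains
    superedges: iterable of (supernode id, supernode id) pairs
    c_plus:     edges missing from the rough description (to be added)
    c_minus:    edges wrongly implied by the rough description (to be removed)
    """
    edges = set()
    # A superedge (A, B) stands for every pair of distinct nodes u in A, v in B.
    for a, b in superedges:
        for u, v in product(supernodes[a], supernodes[b]):
            if u != v:
                edges.add(edge(u, v))
    # The corrections fix the errors of the rough description.
    edges |= {edge(u, v) for u, v in c_plus}
    edges -= {edge(u, v) for u, v in c_minus}
    return edges


# Toy example: six original edges stored as one superedge plus two corrections.
supernodes = {"S1": {"a", "b"}, "S2": {"c", "d", "e"}}
superedges = [("S1", "S2")]
c_plus = [("c", "d")]   # real edge that the superedge does not cover
c_minus = [("b", "e")]  # pair implied by the superedge but absent from the graph

original = decode(supernodes, superedges, c_plus, c_minus)
```

In this toy case the summary stores three items (one superedge and two corrections) in place of six edges, which is the kind of saving the technique aims for. How the nodes are grouped into supernodes, and how that grouping is updated per edge change, is the part the paper addresses and is not shown here.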
Pages: 317-327
Number of pages: 11