Preconditioned Spectral Clustering for Stochastic Block Partition Streaming Graph Challenge

Cited by: 0
Authors
Zhuzhunashvili, David [1]
Knyazev, Andrew [2]
Affiliations
[1] Univ Colorado, Boulder, CO 80309 USA
[2] MERL, 201 Broadway, 8th Floor, Cambridge, MA 02139 USA
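Keywords
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract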
Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) is demonstrated to efficiently solve eigenvalue problems for graph Laplacians that appear in spectral clustering. For static graph partitioning, 10-20 iterations of LOBPCG without preconditioning result in ~10x error reduction, enough to achieve 100% correctness for all Challenge datasets with known truth partitions, e.g., for graphs with 5K/0.1M (50K/1M) vertices/edges in 2 (7) seconds, compared to over 5,000 (30,000) seconds needed by the baseline Python code. Our Python code determines 98 (160) clusters with 100% correctness from the Challenge static graphs with 0.5M (2M) vertices in 270 (1,700) seconds using 10GB (50GB) of memory. Our single-precision MATLAB code calculates the same clusters in half the time and memory. For streaming graph partitioning, LOBPCG is initialized with approximate eigenvectors of the graph Laplacian already computed for the previous graph, in many cases reducing the number of required LOBPCG iterations by a factor of 2-3 compared to the static case. Our spectral clustering is generic, i.e., it assumes nothing specific about the block model or the streaming used to generate the graphs for the Challenge, in contrast to the base code. Nevertheless, in a 10-stage streaming comparison with the base code for the 5K graph, the quality of our clusters is similar or better starting at stage 4 (7) for emerging edges (snowballing) streaming, while the computations are over 100-1000 times faster.
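As a rough illustration of the workflow the abstract describes, the sketch below runs SciPy's `lobpcg` without preconditioning on the normalized Laplacian of a small synthetic graph, then reruns it on a later "stage" of the graph using the earlier eigenvectors as the initial guess. It is not the authors' code. The toy generator `planted_partition`, the helper names, the edge-subsampling stand-in for streaming, and the simple farthest-point/Lloyd clustering step are all assumptions made for illustration only.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import laplacian
from scipy.sparse.linalg import lobpcg


def planted_partition(n_per_block, k, p_in, p_out, rng):
    """Toy symmetric adjacency matrix with k equal planted blocks (hypothetical data)."""
    labels = np.repeat(np.arange(k), n_per_block)
    prob = np.where(labels[:, None] == labels[None, :], p_in, p_out)
    upper = np.triu(rng.random(prob.shape) < prob, k=1)
    return sp.csr_matrix((upper | upper.T).astype(np.float64)), labels


def subsample_edges(adj, keep, rng):
    """Keep each undirected edge with probability `keep` (stand-in for an early stage)."""
    upper = sp.triu(adj, k=1).tocoo()
    mask = rng.random(upper.nnz) < keep
    part = sp.coo_matrix((upper.data[mask], (upper.row[mask], upper.col[mask])),
                         shape=adj.shape)
    return (part + part.T).tocsr()


def smallest_eigvecs(adj, x0, tol=1e-4, maxiter=200):
    """Unpreconditioned LOBPCG on the normalized Laplacian, started from x0."""
    lap = sp.csr_matrix(laplacian(adj, normed=True))
    vals, vecs, history = lobpcg(lap, x0.copy(), largest=False, tol=tol,
                                 maxiter=maxiter, retResidualNormsHistory=True)
    return vals, vecs, len(history)  # history length tracks the iteration count


def cluster_rows(emb, k, n_iter=20):
    """Deterministic farthest-point init plus Lloyd iterations on unit-norm rows."""
    rows = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    centers = [rows[0]]
    for _ in range(k - 1):
        dist = np.min([np.linalg.norm(rows - c, axis=1) for c in centers], axis=0)
        centers.append(rows[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((rows[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = np.argmin(sq, axis=1)
        centers = np.array([rows[assign == j].mean(axis=0) for j in range(k)])
    return assign


def same_partition(pred, truth):
    """True if pred equals truth up to a relabeling of the clusters."""
    pairs = set(zip(pred.tolist(), truth.tolist()))
    return len(pairs) == len(set(pred.tolist())) == len(set(truth.tolist()))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    k = 3
    full, truth = planted_partition(50, k, 0.5, 0.01, rng)
    early = subsample_edges(full, 0.5, rng)            # earlier streaming stage
    x_rand = rng.standard_normal((full.shape[0], k))   # cold-start initial guess
    _, v_early, _ = smallest_eigvecs(early, x_rand)
    _, v_cold, it_cold = smallest_eigvecs(full, x_rand)
    _, v_warm, it_warm = smallest_eigvecs(full, v_early)  # warm start
    print("residual-history length, cold vs warm:", it_cold, it_warm)
    print("partition recovered:", same_partition(cluster_rows(v_warm, k), truth))
```

The only difference between the cold and warm runs is the initial block `x0`, which is how the streaming variant in the abstract reuses work from the previous graph. Whether the warm start saves iterations depends on how much the graph changed between stages.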
Pages: 6