Strong consistency guarantees for clustering high-dimensional bipartite graphs with the spectral method

Cited by: 0
Author
Braun, Guillaume [1]
Affiliation
[1] RIKEN AIP, Tokyo, Japan
Source
ELECTRONIC JOURNAL OF STATISTICS | 2024 / Vol. 18 / Issue 02
Keywords
Spectral method; clustering; bipartite stochastic block model; MATRICES;
DOI
10.1214/24-EJS2271
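Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code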
020208 ; 070103 ; 0714 ;
Abstract
We investigate the problem of clustering bipartite graphs using a simple spectral method within the framework of the Bipartite Stochastic Block Model (BiSBM), a popular model for bipartite graphs having a community structure. Our focus lies in the high-dimensional setting where the number $n_1$ of rows and the number $n_2$ of columns of the associated adjacency matrix differ significantly. A recent study by [4] has established a sufficient and necessary condition related to the sparsity level $p_{\max}$ of the bipartite graph, enabling the recovery of the latent partition of the rows. In their work, [4] introduces an iterative method that extends the approach proposed by [26] to achieve the stated recovery goal. However, empirical results suggest that the subsequent refinement algorithm does not significantly enhance the performance of the spectral method, indicating that the spectral method achieves exact recovery within the same regime as the refinement method. We establish this claim by deriving new entrywise bounds on the eigenvectors of the similarity matrix utilized by the spectral method. Our analysis extends the framework of [23], which is limited to symmetric matrices with restricted dependencies. As a critical technical step, we also derive an improved concentration inequality tailored for similarity matrices.
Pages: 2798-2823
Number of pages: 26
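The abstract describes clustering the rows of a BiSBM adjacency matrix with a spectral method applied to a similarity matrix. A minimal sketch of that kind of pipeline is given below. It is an illustration under stated assumptions, not the paper's algorithm: the similarity matrix is taken to be the Gram matrix A Aᵀ with its diagonal set to zero, the embedding uses the K eigenvectors with the largest eigenvalues, and the rows are grouped with a small hand-written k-means. The function names (`sample_bisbm`, `spectral_row_clustering`, `misclassification_rate`) and all parameter values are hypothetical.

```python
import numpy as np


def sample_bisbm(n1, n2, conn, rng):
    """Draw an n1 x n2 bipartite adjacency matrix from a BiSBM.

    conn is a K x L matrix of edge probabilities between row and column
    communities; labels are drawn uniformly at random (an assumption).
    """
    K, L = conn.shape
    z_rows = rng.integers(0, K, size=n1)   # latent row labels
    z_cols = rng.integers(0, L, size=n2)   # latent column labels
    probs = conn[z_rows][:, z_cols]        # n1 x n2 edge probabilities
    A = (rng.random((n1, n2)) < probs).astype(float)
    return A, z_rows


def spectral_row_clustering(A, K, n_iter=50):
    """Cluster the rows of A from the top-K eigenvectors of a similarity matrix."""
    # Assumed similarity matrix: Gram matrix of the rows with zeroed diagonal.
    B = A @ A.T
    np.fill_diagonal(B, 0.0)

    # eigh returns eigenvalues in ascending order; keep the K largest.
    _, eigvecs = np.linalg.eigh(B)
    U = eigvecs[:, -K:]

    # Plain Lloyd k-means on the rows of U with farthest-point initialisation.
    centers = [U[0]]
    for _ in range(1, K):
        d = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = U[labels == k].mean(axis=0)
    return labels


def misclassification_rate(labels, truth):
    """Fraction of misclassified rows for two communities, up to label swap."""
    err = np.mean(labels != truth)
    return min(err, 1.0 - err)


# High-dimensional toy example: far more columns than rows.
rng = np.random.default_rng(0)
conn = 0.2 * np.array([[1.0, 0.2], [0.2, 1.0]])   # hypothetical connectivity
A, z = sample_bisbm(150, 3000, conn, rng)
labels = spectral_row_clustering(A, K=2)
print("misclassification rate:", misclassification_rate(labels, z))
```

Zeroing the diagonal is used here because the diagonal of A Aᵀ holds the row degrees, which carry no information about which rows share a community; whether the paper uses exactly this similarity matrix is not stated in this record.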
Related papers
50 in total
  • [1] SPECTRAL CLUSTERING AND THE HIGH-DIMENSIONAL STOCHASTIC BLOCKMODEL
    Rohe, Karl
    Chatterjee, Sourav
    Yu, Bin
    [J]. ANNALS OF STATISTICS, 2011, 39 (04): 1878 - 1915
  • [2] Multiview Spectral Clustering of High-Dimensional Observational Data
    Roman-Messina, A.
    Castro-Arvizu, Claudia M.
    Castillo-Tapia, Alejandro
    Murillo-Aguirre, Erlan R.
    Rodriguez-Villalon, O.
    [J]. IEEE ACCESS, 2023, 11 : 115884 - 115893
  • [3] An Initialization Method for Clustering High-Dimensional Data
    Chen, Luying
    Chen, Lifei
    Jiang, Qingshan
    Wang, Beizhan
    Shi, Liang
    [J]. FIRST INTERNATIONAL WORKSHOP ON DATABASE TECHNOLOGY AND APPLICATIONS, PROCEEDINGS, 2009, : 444 - +
  • [4] Statistical guarantees for local spectral clustering on random neighborhood graphs
    Green, Alden
    Balakrishnan, Sivaraman
    Tibshirani, Ryan J.
    [J]. Journal of Machine Learning Research, 2021, 22
  • [5] High-dimensional clustering method for high performance data mining
    Chang, Jae-Woo
    Lee, Hyun-Jo
    [J]. COMPUTATIONAL SCIENCE - ICCS 2007, PT 3, PROCEEDINGS, 2007, 4489 : 621 - +
  • [6] An Improved Initialization Method for Clustering High-Dimensional Data
    Zhang, Yanping
    Jiang, Qingshan
    [J]. 2010 2ND INTERNATIONAL WORKSHOP ON DATABASE TECHNOLOGY AND APPLICATIONS PROCEEDINGS (DBTA), 2010,
  • [7] An efficient clustering method for high-dimensional data mining
    Chang, JW
    Kim, YK
    [J]. ADVANCES IN ARTIFICIAL INTELLIGENCE - SBIA 2004, 2004, 3171 : 276 - 285
  • [8] Strong Consistency of Spectral Clustering for Stochastic Block Models
    Su, Liangjun
    Wang, Wuyi
    Zhang, Yichong
    [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 2020, 66 (01) : 324 - 338
  • [9] Spectral Clustering of High-dimensional Data via Nonnegative Matrix Factorization
    Wang, Shulin
    Chen, Fang
    Fang, Jianwen
    [J]. 2015 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2015,
  • [10] Global and Local Structure Preservation for Nonlinear High-dimensional Spectral Clustering
    Wen, Guoqiu
    Zhu, Yonghua
    Chen, Linjun
    Zhan, Mengmeng
    Xie, Yangcai
    [J]. COMPUTER JOURNAL, 2021, 64 (07): 993 - 1004