Fast spectral clustering learning with hierarchical bipartite graph for large-scale data

Cited by: 46
Authors
Yang, Xiaojun [1]
Yu, Weizhong [2]
Wang, Rong [3]
Zhang, Guohao [1]
Nie, Feiping [3]
Affiliations
[1] Guangdong Univ Technol, Sch Informat Engn, Guangzhou 510006, Peoples R China
[2] Xi An Jiao Tong Univ, Sch Elect & Informat Engn, Xian 710049, Peoples R China
[3] Northwestern Polytech Univ, Ctr OPT IMagery Anal & Learning OPTIMAL, Xian 710072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Spectral clustering; Hierarchical graph; Bipartite graph; Large scale data; Out-of-sample;
DOI
10.1016/j.patrec.2018.06.024
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Spectral clustering (SC) is drawing more and more attention due to its effectiveness in unsupervised learning. However, existing SC methods still have limitations. First, they are not suitable for large-scale problems due to their high computational complexity. Second, the neighborhood weighted graph is constructed with the Gaussian kernel, so extra work is required to tune the heat-kernel parameter. To overcome these issues, we propose a novel spectral clustering based on hierarchical bipartite graph (SCHBG) approach that explores multiple layers of anchors arranged in a pyramid-style structure. First, the proposed algorithm constructs a hierarchical bipartite graph and then performs spectral analysis on that graph. As a result, the computational complexity can be largely reduced. Furthermore, we adopt a parameter-free yet effective neighbor assignment strategy to construct the similarity matrix, which avoids the need to tune the heat-kernel parameter. Finally, the algorithm is able to deal with the out-of-sample problem for large-scale data, and its computational complexity is significantly reduced. Experiments demonstrate the efficiency and effectiveness of the proposed SCHBG algorithm. Results show that the SCHBG approach can achieve good clustering accuracy (76%) on a dataset of 8 million samples. Furthermore, owing to the use of the bipartite graph, the algorithm can reduce the time cost in out-of-sample situations while keeping almost the same clustering accuracy as on the large-scale data. (C) 2018 Elsevier B.V. All rights reserved.
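The abstract names two building blocks: a parameter-free neighbor assignment between data points and anchors, and spectral analysis on the resulting bipartite graph. The following is a minimal single-layer sketch of that general idea, not the SCHBG algorithm itself. The closed-form weights are an assumption borrowed from the adaptive-neighbor formulation commonly used in anchor-graph methods; the function names `anchor_similarity` and `spectral_embedding`, the neighbor count `k`, and the random choice of anchors are all hypothetical. The pyramid of multiple anchor layers described in the abstract is not reproduced here.

```python
import numpy as np

def anchor_similarity(X, anchors, k=5):
    """Point-to-anchor weights with no kernel width to tune (assumed formula).

    Each point is linked to its k nearest anchors; each row of Z sums to 1.
    """
    # Squared Euclidean distances between n points and m anchors: shape (n, m)
    d = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    idx = np.argsort(d, axis=1)[:, :k + 1]            # k+1 nearest anchors
    d_sorted = np.take_along_axis(d, idx, axis=1)
    d_k1 = d_sorted[:, [k]]                           # distance to the (k+1)-th
    num = d_k1 - d_sorted[:, :k]
    den = k * d_k1 - d_sorted[:, :k].sum(axis=1, keepdims=True)
    # If all k+1 distances tie, fall back to uniform weights
    w = np.where(den < 1e-12, 1.0 / k, num / np.maximum(den, 1e-12))
    Z = np.zeros_like(d)
    np.put_along_axis(Z, idx[:, :k], w, axis=1)
    return Z

def spectral_embedding(Z, c):
    """Top-c left singular vectors of the column-normalized bipartite matrix."""
    col = Z.sum(axis=0)
    B = Z / np.sqrt(np.maximum(col, 1e-12))           # Z * Lambda^(-1/2)
    U, _, _ = np.linalg.svd(B, full_matrices=False)   # singular values descending
    return U[:, :c]

# Illustrative usage on synthetic data (anchors picked at random as a stand-in)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
anchors = X[rng.choice(len(X), size=20, replace=False)]
Z = anchor_similarity(X, anchors, k=5)
F = spectral_embedding(Z, c=3)                        # feed F to k-means to get labels
```

Working on the n-by-m matrix `Z` rather than an n-by-n affinity matrix is what keeps the cost low when the number of anchors m is much smaller than n. Under the same assumptions, an unseen point could be handled by computing its row of `Z` against the existing anchors instead of rebuilding the graph.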
Pages: 345-352
Number of pages: 8
Related papers
50 items in total
  • [41] Fast Placement for Large-scale Hierarchical FPGAs
    Dai, Hui
    Zhou, Qiang
    Cai, Yici
    Bian, Jinian
    Hong, Xianlong
    [J]. 2009 11TH IEEE INTERNATIONAL CONFERENCE ON COMPUTER-AIDED DESIGN AND COMPUTER GRAPHICS, PROCEEDINGS, 2009, : 190 - 194
  • [42] Large-scale parallel data clustering
    Judd, D
    McKinley, PK
    Jain, AK
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1998, 20 (08) : 871 - 876
  • [43] Inference of mouse heart-specific subnetwork from large-scale data compendium using spectral graph clustering
    Hong, Seong-Eui
    Park, Inju
    Cha, Hyeseon
    Rho, Seong-Hwan
    Park, Woo Jin
    Cho, Chunghee
    Kim, Do Han
    [J]. BIOPHYSICAL JOURNAL, 2007, : 646A - 646A
  • [44] Adaptive Neighbors Graph Learning for Large-Scale Data Clustering using Vector Quantization and Self-Regularization
    Cai, Yongda
    Huang, Joshua Zhexue
    Ngueilbaye, Alladoumbaye
    Sun, Xudong
    [J]. APPLIED SOFT COMPUTING, 2024, 167
  • [45] A distributed and incremental algorithm for large-scale graph clustering
    Inoubli, Wissem
    Aridhi, Sabeur
    Mezni, Haithem
    Maddouri, Mondher
    Nguifo, Engelbert Mephu
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2022, 134 : 334 - 347
  • [46] A Novel Clustering Algorithm for Large-Scale Graph Processing
    Qu, Zhaoyang
    Ding, Wei
    Qu, Nan
    Yan, Jia
    Wang, Ling
    [J]. INTELLIGENT COMPUTING METHODOLOGIES, ICIC 2016, PT III, 2016, 9773 : 349 - 358
  • [47] Fast Spectral Clustering With Anchor Graph for Large Hyperspectral Images
    Wang, Rong
    Nie, Feiping
    Yu, Weizhong
    [J]. IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2017, 14 (11) : 2003 - 2007
  • [48] Hyperspectral Fast Spectral Clustering Algorithm Based on Multi-Layer Bipartite Graph
    Li Siyuan
    Zheng Zhiyuan
    Du Xiaoyan
    Liu Tong
    Yang Xiaojun
    [J]. LASER & OPTOELECTRONICS PROGRESS, 2022, 59 (12)
  • [49] Fast Clustering by Directly Solving Bipartite Graph Clustering Problem
    Nie, Feiping
    Xue, Jingjing
    Wang, Rong
    Zhang, Liang
    Li, Xuelong
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (07) : 9174 - 9185
  • [50] Large-Scale Spectral Clustering Based on Representative Points
    Yang, Libo
    Liu, Xuemei
    Nie, Feiping
    Liu, Mingtang
    [J]. MATHEMATICAL PROBLEMS IN ENGINEERING, 2019, 2019