Fast and scalable support vector clustering for large-scale data analysis

Authors
Yuan Ping
Yun Feng Chang
Yajian Zhou
Ying Jie Tian
Yi Xian Yang
Zhili Zhang
Affiliations
[1] Xuchang University, School of Information Engineering
[2] Beijing University of Posts and Telecommunications, Information Security Center
[3] China Three Gorges University YiChang, College of Science
[4] Graduate University of Chinese Academy of Sciences, School of Information Engineering
[5] Beijing Institute of Graphic Communication
Source
Knowledge and Information Systems, 2015, 43 (02)
Keywords
Large-scale problem; Support vector clustering; Convex decomposition; Cluster boundary; Cluster labeling
DOI
Not available
Abstract
As an important boundary-based clustering algorithm, support vector clustering (SVC) benefits many applications because it can handle arbitrary cluster shapes. However, its adoption is limited by two problems: highly expensive computation, caused by the redundant kernel matrix required to estimate the support function, and poor labeling performance, caused by inefficiently checking segments between all pairs of data points. To address these two problems, this paper proposes a fast and scalable SVC (FSSVC) method that significantly improves efficiency while guaranteeing accuracy comparable to state-of-the-art methods. The core of the approach is (1) constructing the hypersphere and support function from cluster boundaries, which prunes unnecessary computation and storage of kernel functions, and (2) an adaptive labeling strategy that decomposes clusters into convex hulls and then applies either a convex-decomposition-based cluster labeling algorithm or a cone cluster labeling algorithm, depending on whether the radius of the hypersphere is greater than 1. Both theoretical analysis and experimental results (e.g., the first rank in a nonparametric statistical test) show the superiority of the method over the others, especially for large-scale data analysis under limited memory.
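To make the two bottlenecks named in the abstract concrete, the following is a minimal Python sketch of the conventional SVC baseline, not of FSSVC itself. It evaluates a Gaussian-kernel support function over all data points (which needs the full kernel matrix) and labels clusters by sampling the segment between every pair of points. Everything specific here is an assumption for illustration: the function names, the kernel width `q`, the uniform coefficients `beta` (a feasible but not optimal stand-in for a solved dual problem), and the choice of squared radius `r2` as the largest support value over the data.

```python
import numpy as np

def gaussian_kernel(a, b, q):
    """K(a, b) = exp(-q * ||a - b||^2) for row-wise point sets a (m, d) and b (n, d)."""
    sq = np.sum(a**2, axis=1)[:, None] - 2.0 * a @ b.T + np.sum(b**2, axis=1)[None, :]
    return np.exp(-q * np.maximum(sq, 0.0))

def center_norm2(points, beta, q):
    """Squared norm of the sphere centre; needs the full n x n kernel matrix."""
    return float(beta @ gaussian_kernel(points, points, q) @ beta)

def support_function(x, points, beta, q, c2):
    """Squared feature-space distance from each row of x to the sphere centre.
    For a Gaussian kernel K(x, x) = 1, so f(x) = 1 - 2 * sum_j beta_j K(x_j, x) + c2."""
    return 1.0 - 2.0 * gaussian_kernel(x, points, q) @ beta + c2

def segment_inside(a, b, points, beta, q, c2, r2, n_samples=10):
    """True if every sampled point on the segment a-b stays inside the sphere."""
    t = np.linspace(0.0, 1.0, n_samples)[:, None]
    samples = (1.0 - t) * a + t * b
    return bool(np.all(support_function(samples, points, beta, q, c2) <= r2))

def label_by_segments(points, beta, q, r2, n_samples=10):
    """Complete-graph labeling: test all O(n^2) pairs, then take connected components."""
    n = len(points)
    c2 = center_norm2(points, beta, q)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if segment_inside(points[i], points[j], points, beta, q, c2, r2, n_samples):
                parent[find(j)] = find(i)

    roots = {}
    return np.array([roots.setdefault(find(i), len(roots)) for i in range(n)])

# Toy data (hypothetical): two small, well-separated squares of points.
points = np.array([[0.0, 0.0], [0.2, 0.0], [0.0, 0.2], [0.2, 0.2],
                   [5.0, 5.0], [5.2, 5.0], [5.0, 5.2], [5.2, 5.2]])
q = 1.0
beta = np.full(len(points), 1.0 / len(points))   # assumed, not a solved dual
c2 = center_norm2(points, beta, q)
# Assumed radius: largest support value over the data, plus a rounding tolerance.
r2 = float(support_function(points, points, beta, q, c2).max()) + 1e-9
labels = label_by_segments(points, beta, q, r2)
print(labels)
```

With n points and m samples per segment, this baseline performs on the order of n^2 * m support-function evaluations, each touching all n kernel terms, on top of an n x n kernel matrix. That cost is what the abstract's boundary-based construction and adaptive labeling strategy are meant to reduce.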
Pages: 281 - 310
Page count: 29
Citation
Ping, Yuan; Chang, Yun Feng; Zhou, Yajian; Tian, Ying Jie; Yang, Yi Xian; Zhang, Zhili. Fast and scalable support vector clustering for large-scale data analysis [J]. Knowledge and Information Systems, 2015, 43 (02): 281 - 310