Joint image clustering and feature selection with auto-adjoined learning for high-dimensional data

Cited by: 11
Authors
Wang, Xiaodong [1 ]
Wu, Pengtao [1 ]
Xu, Qinghua [1 ]
Zeng, Zhiqiang [1 ]
Xie, Yong [2 ]
Affiliations
[1] Xiamen Univ Technol, Coll Comp & Informat Engn, Xiamen, Peoples R China
[2] Nanjing Univ Posts & Telecommun, Sch Comp Sci, Nanjing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
K-means; Dimension reduction; Feature selection; Clustering;
DOI
10.1016/j.knosys.2021.107443
CLC classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Due to the rapid development of modern multimedia techniques, high-dimensional image data are frequently encountered in many image analysis tasks, such as clustering and feature learning. K-means (KM) is one of the most widely used and efficient tools for clustering high-dimensional data. However, because such data commonly contain irrelevant features or noise, conventional KM suffers degraded performance on high-dimensional data. Recent studies try to overcome this problem by combining KM with subspace learning. Nevertheless, they usually depend on complex eigenvalue decomposition, which requires expensive computational resources. Besides, their clustering models also ignore the local manifold structure among data, failing to utilize the underlying adjacency information. Two points are critical for clustering high-dimensional image data: efficient feature selection and clear adjacency exploration. Based on these considerations, we propose an auto-adjoined subspace clustering method. Concretely, to efficiently locate the redundant features, we impose an extremely sparse feature selection matrix on KM, which is easy to optimize. Besides, to accurately encode the local adjacency among data without the influence of noise, we propose to automatically assign the connectivity of each sample in the low-dimensional feature space. Compared with several state-of-the-art clustering methods, the proposed method consistently improves clustering performance on six publicly available benchmark image datasets, demonstrating its effectiveness. (C) 2021 Elsevier B.V. All rights reserved.
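The abstract describes the method only at a high level. The following minimal Python sketch illustrates the general idea it names: K-means combined with a hard (extremely sparse) feature selection and a nearest-neighbour adjacency graph built in the reduced feature space. The function names (select_features, knn_adjacency, sparse_km), the feature-scoring rule, and the fixed k-NN graph are illustrative assumptions, not the paper's actual auto-adjoined learning or its optimization.

```python
# Illustrative sketch only: alternating K-means, hard feature selection,
# and a k-NN adjacency graph in the selected subspace. This is NOT the
# authors' exact algorithm; scoring rule and graph construction are assumed.
import numpy as np

def select_features(X, labels, n_keep):
    # Score each feature by the variance of the cluster means along it
    # (a rough proxy for discriminative power) and keep the n_keep best.
    centers = np.array([X[labels == c].mean(axis=0) for c in np.unique(labels)])
    scores = centers.var(axis=0)
    mask = np.zeros(X.shape[1], dtype=bool)
    mask[np.argsort(scores)[-n_keep:]] = True   # extremely sparse 0/1 selection
    return mask

def knn_adjacency(Z, k=5):
    # Binary, symmetrized k-nearest-neighbour graph computed in the reduced
    # space Z; stands in for the automatically learned connectivity.
    d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]
    A = np.zeros((len(Z), len(Z)))
    A[np.repeat(np.arange(len(Z)), k), nn.ravel()] = 1.0
    return np.maximum(A, A.T)

def sparse_km(X, n_clusters, n_keep, n_iters=10, seed=0):
    # Alternate between K-means on the currently selected features and
    # re-selecting features from the current partition.
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, n_clusters, size=len(X))
    mask = np.ones(X.shape[1], dtype=bool)
    for _ in range(n_iters):
        Z = X[:, mask]
        centers = np.array([
            Z[labels == c].mean(axis=0) if np.any(labels == c)
            else Z[rng.integers(len(Z))]        # re-seed empty clusters
            for c in range(n_clusters)
        ])
        labels = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=-1).argmin(axis=1)
        mask = select_features(X, labels, n_keep)
    return labels, mask, knn_adjacency(X[:, mask])
```

For example, labels, mask, A = sparse_km(X, n_clusters=10, n_keep=50) on an (n_samples, n_features) array X returns cluster labels, the selected-feature mask, and the adjacency graph over the reduced space.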
Pages: 12
Related papers
50 records in total
  • [1] Clustering high-dimensional data via feature selection
    Liu, Tianqi
    Lu, Yu
    Zhu, Biqing
    Zhao, Hongyu
    [J]. BIOMETRICS, 2023, 79 (02) : 940 - 950
  • [2] On online high-dimensional spherical data clustering and feature selection
    Amayri, Ola
    Bouguila, Nizar
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2013, 26 (04) : 1386 - 1398
  • [3] Latent Feature Group Learning for High-Dimensional Data Clustering
    Wang, Wenting
    He, Yulin
    Ma, Liheng
    Huang, Joshua Zhexue
    [J]. INFORMATION, 2019, 10 (06):
  • [4] A GA-based Feature Selection for High-dimensional Data Clustering
    Sun, Mei
    Xiong, Langhuan
    Sun, Haojun
    Jiang, Dazhi
    [J]. THIRD INTERNATIONAL CONFERENCE ON GENETIC AND EVOLUTIONARY COMPUTING, 2009, : 769 - 772
  • [5] Feature selection for high-dimensional data
    Bolón-Canedo, V.
    Sánchez-Maroño, N.
    Alonso-Betanzos, A.
    [J]. Progress in Artificial Intelligence, 2016, 5 (2) : 65 - 75
  • [6] Feature selection for high-dimensional data
    Destrero, A.
    Mosci, S.
    De Mol, C.
    Verri, A.
    Odone, F.
    [J]. Computational Management Science, 2009, 6 (1) : 25 - 40
  • [7] A density-based clustering algorithm for high-dimensional data with feature selection
    Qi, Xianting
    Wang, Pan
    [J]. 2016 2ND INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS - COMPUTING TECHNOLOGY, INTELLIGENT TECHNOLOGY, INDUSTRIAL INFORMATION INTEGRATION (ICIICII), 2016, : 114 - 118
  • [8] Differential Privacy High-Dimensional Data Publishing Based on Feature Selection and Clustering
    Chu, Zhiguang
    He, Jingsha
    Zhang, Xiaolei
    Zhang, Xing
    Zhu, Nafei
    [J]. ELECTRONICS, 2023, 12 (09)
  • [9] Subspace selection for clustering high-dimensional data
    Baumgartner, C
    Plant, C
    Kailing, K
    Kriegel, HP
    Kröger, P
    [J]. FOURTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2004, : 11 - 18
  • [10] FEATURE SELECTION FOR HIGH-DIMENSIONAL DATA ANALYSIS
    Verleysen, Michel
    [J]. NCTA 2011: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON NEURAL COMPUTATION THEORY AND APPLICATIONS, 2011, : IS23 - IS25