Iterative column subset selection

Cited by: 7
Authors
Ordozgoiti, Bruno [1 ]
Gomez Canaval, Sandra [1 ]
Mozo, Alberto [1 ]
Affiliations
[1] Univ Politecn Madrid, Dept Comp Syst, Madrid, Spain
Funding
EU Horizon 2020;
Keywords
Column subset selection; Unsupervised feature selection; Dimensionality reduction; Machine learning; Data mining; UNSUPERVISED FEATURE-SELECTION; FACE RECOGNITION; RANK; DECOMPOSITION; RELEVANCE;
DOI
10.1007/s10115-017-1115-4
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Dimensionality reduction is often a crucial step for the successful application of machine learning and data mining methods. One way to achieve said reduction is feature selection. Due to the impossibility of labelling many data sets, unsupervised approaches are frequently the only option. The column subset selection problem translates naturally to this purpose and has received considerable attention over the last few years, as it provides simple linear models for low-rank data reconstruction. Recently, it was empirically shown that an iterative algorithm, which can be implemented efficiently, provides better subsets than other state-of-the-art methods. In this paper, we describe this algorithm and provide a more in-depth analysis. We carry out numerous experiments to gain insights on its behaviour and derive a simple bound for the norm recovered by the resulting matrix. To the best of our knowledge, this is the first theoretical result of this kind for this algorithm.
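The abstract does not spell out the iterative algorithm, so the following is only a minimal sketch under stated assumptions. It assumes the usual column subset selection objective, i.e. choosing k columns C of a matrix A so that the Frobenius-norm residual of projecting A onto the span of C is small, and it assumes an alternating scheme that revisits one selected column at a time and swaps it for the best available replacement. The names `residual_norm` and `iterative_css` are hypothetical, and this naive version recomputes a least-squares fit for every candidate; it is not the efficient implementation the abstract refers to.

```python
import numpy as np


def residual_norm(A, cols):
    """Frobenius norm of A minus its projection onto span(A[:, cols])."""
    C = A[:, cols]
    # Least-squares coefficients X minimising ||C X - A||_F.
    X, *_ = np.linalg.lstsq(C, A, rcond=None)
    return np.linalg.norm(A - C @ X)


def iterative_css(A, k, max_sweeps=10, seed=0):
    """Assumed alternating swap heuristic (hypothetical helper, not the paper's code)."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    # Start from a random subset of k distinct column indices.
    cols = list(rng.choice(n, size=k, replace=False))
    best = residual_norm(A, cols)
    for _ in range(max_sweeps):
        improved = False
        for pos in range(k):
            # Try every unselected column in slot `pos`, keeping the best one seen.
            for cand in range(n):
                if cand in cols:
                    continue
                trial = cols.copy()
                trial[pos] = cand
                err = residual_norm(A, trial)
                if err < best - 1e-12:
                    cols, best, improved = trial, err, True
        if not improved:
            break  # A full sweep changed nothing: local optimum reached.
    return sorted(int(c) for c in cols), best
```

Calling `iterative_css(A, k)` returns the selected column indices and the residual they achieve. Because a swap is accepted only when it lowers the residual, the objective never increases, but the result is a local optimum that may depend on the random start.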
Pages: 65-94
Page count: 30
Related papers
50 in total
  • [41] Label Selection Algorithm Based on Iteration Column Subset Selection for Multi-label Classification
    Peng, Tao
    Li, Jun
    Xu, Jianhua
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, DEXA 2022, PT I, 2022, 13426 : 287 - 301
  • [42] A Fast Satellite Selection Algorithm Based on Hierarchical Clustering and Iterative Subset Optimization
    Jing, Dan
    Li, Weidie
    Han, Liang
    Li, Xinchen
    Li, Liangchao
    Zhang, Yan
    Guo, Liang
    Xing, Mengdao
    REMOTE SENSING, 2025, 17 (05)
  • [43] Provably Correct Algorithms for Matrix Column Subset Selection with Selectively Sampled Data
    Wang, Yining
    Singh, Aarti
    JOURNAL OF MACHINE LEARNING RESEARCH, 2018, 18 : 1 - 42
  • [44] Deterministic column subset selection for single-cell RNA-Seq
    McCurdy, Shannon R.
    Ntranos, Vasilis
    Pachter, Lior
    PLOS ONE, 2019, 14 (01):
  • [45] LOW-RANK APPROXIMATION IN THE FROBENIUS NORM BY COLUMN AND ROW SUBSET SELECTION
    Cortinovis, Alice
    Kressner, Daniel
    SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS, 2020, 41 (04) : 1651 - 1673
  • [46] Equivalence between Graph Spectral Clustering and Column Subset Selection (Student Abstract)
    Wan, Guihong
    Mao, Wei
    Semenov, Yevgeniy R.
    Schweitzer, Haim
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 21, 2024, : 23673 - 23675
  • [47] Graph Clustering Methods Derived from Column Subset Selection (Student Abstract)
    Mao, Wei
    Wan, Guihong
    Schweitzer, Haim
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 21, 2024, : 23573 - 23575
  • [48] Average Case Column Subset Selection for Entrywise l1-Norm Loss
    Song, Zhao
    Woodruff, David P.
    Zhong, Peilin
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [49] Improved guarantees and a multiple-descent curve for Column Subset Selection and the Nystrom method
    Derezinski, Michal
    Khanna, Rajiv
    Mahoney, Michael W.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [50] A data mining-based subset selection for enhanced discrimination using iterative elimination of redundancy
    Cho, Hyun-Woo
    EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (02) : 1355 - 1361