BEAR: Sketching BFGS Algorithm for Ultra-High Dimensional Feature Selection in Sublinear Memory

Cited: 0
Authors:
Aghazadeh, Amirali [1 ]
Gupta, Vipul [1 ]
DeWeese, Alex [1 ]
Koyluoglu, O. Ozan [1 ]
Ramchandran, Kannan [1 ]
Affiliation:
[1] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
Keywords:
Feature selection; sketching; second-order optimization; sublinear memory;
DOI: not available
Chinese Library Classification (CLC):
TP18 [Theory of Artificial Intelligence];
Discipline codes:
081104; 0812; 0835; 1405
Abstract:
We consider feature selection for applications in machine learning where the dimensionality of the data is so large that it exceeds the working memory of the (local) computing machine. Unfortunately, current large-scale sketching algorithms exhibit a poor memory-accuracy trade-off when selecting features in high dimensions, due to the irreversible collision and accumulation of stochastic gradient noise in the sketched domain. Here, we develop a second-order feature selection algorithm, called BEAR, which avoids the extra collisions by efficiently storing the second-order stochastic gradients of the celebrated Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm in a Count Sketch, at a memory cost that grows sublinearly with the size of the feature vector. BEAR reveals an unexplored advantage of second-order optimization for memory-constrained high-dimensional gradient sketching. Our extensive experiments on several real-world data sets, from genomics to language processing, demonstrate that BEAR requires up to three orders of magnitude less memory to achieve the same classification accuracy as first-order sketching algorithms, with a comparable run time. Our theoretical analysis further proves the global convergence of BEAR at a rate of O(1/t) over t iterations of the sketched algorithm.
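The core data structure named in the abstract is the Count Sketch, which stores a d-dimensional vector (here, gradient updates) in memory sublinear in d by hashing each coordinate into a small table under random signs, and recovering a coordinate as the median of its per-row estimates. The following is a minimal illustrative sketch of that structure, not the authors' BEAR implementation; all class and parameter names are chosen for exposition.

```python
import numpy as np

class CountSketch:
    """Minimal Count Sketch: stores a d-dimensional vector in `rows`
    hash tables of width `width` (memory rows*width << d), supporting
    additive updates and median-of-estimates point queries."""

    def __init__(self, rows, width, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.rows, self.width = rows, width
        # Random bucket h_j(i) and random sign s_j(i) for each row j
        # and coordinate i (tabulated here for simplicity; a real
        # implementation would use hash functions to stay sublinear).
        self.h = rng.integers(0, width, size=(rows, dim))
        self.s = rng.choice([-1.0, 1.0], size=(rows, dim))
        self.table = np.zeros((rows, width))

    def update(self, i, delta):
        """Add `delta` to coordinate i (e.g., a gradient component)."""
        for j in range(self.rows):
            self.table[j, self.h[j, i]] += self.s[j, i] * delta

    def query(self, i):
        """Recover coordinate i as the median of the row estimates."""
        return float(np.median(
            [self.s[j, i] * self.table[j, self.h[j, i]]
             for j in range(self.rows)]))
```

Because updates are additive, accumulated (second-order) gradient components can be streamed into the sketch coordinate by coordinate, and the heaviest coordinates, i.e., the selected features, recovered by querying; the median across rows keeps the estimate robust to hash collisions.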
Pages: 75-92
Page count: 18