BEAR: Sketching BFGS Algorithm for Ultra-High Dimensional Feature Selection in Sublinear Memory

Cited: 0
Authors:
Aghazadeh, Amirali [1 ]
Gupta, Vipul [1 ]
DeWeese, Alex [1 ]
Koyluoglu, O. Ozan [1 ]
Ramchandran, Kannan [1 ]
Affiliation:
[1] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
Keywords:
Feature selection; sketching; second-order optimization; sublinear memory;
DOI: not available
Chinese Library Classification (CLC):
TP18 [Theory of Artificial Intelligence];
Discipline codes:
081104; 0812; 0835; 1405
Abstract:
We consider feature selection for applications in machine learning where the dimensionality of the data is so large that it exceeds the working memory of the (local) computing machine. Unfortunately, current large-scale sketching algorithms exhibit a poor memory-accuracy trade-off when selecting features in high dimensions, due to the irreversible collision and accumulation of stochastic gradient noise in the sketched domain. Here, we develop a second-order feature selection algorithm, called BEAR, which avoids the extra collisions by efficiently storing the second-order stochastic gradients of the celebrated Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm in a Count Sketch, at a memory cost that grows sublinearly with the size of the feature vector. BEAR reveals an unexplored advantage of second-order optimization for memory-constrained high-dimensional gradient sketching. Our extensive experiments on several real-world data sets, from genomics to language processing, demonstrate that BEAR requires up to three orders of magnitude less memory to achieve the same classification accuracy as first-order sketching algorithms, with a comparable run time. Our theoretical analysis further proves the global convergence of BEAR at a rate of O(1/t) over t iterations of the sketched algorithm.
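The core data structure named in the abstract is the Count Sketch, which stores a d-dimensional vector (here, gradient updates) in memory sublinear in d by hashing each coordinate into a small table under random signs, and recovering a coordinate as the median of its per-row estimates. The following is a minimal illustrative sketch of that structure, not the authors' BEAR implementation; all class and parameter names are chosen for exposition.

```python
import numpy as np

class CountSketch:
    """Minimal Count Sketch: stores a d-dimensional vector in `rows`
    hash tables of width `width` (memory rows*width << d), supporting
    additive updates and median-of-estimates point queries."""

    def __init__(self, rows, width, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.rows, self.width = rows, width
        # Random bucket h_j(i) and random sign s_j(i) for each row j
        # and coordinate i (tabulated here for simplicity; a real
        # implementation would use hash functions to stay sublinear).
        self.h = rng.integers(0, width, size=(rows, dim))
        self.s = rng.choice([-1.0, 1.0], size=(rows, dim))
        self.table = np.zeros((rows, width))

    def update(self, i, delta):
        """Add `delta` to coordinate i (e.g., a gradient component)."""
        for j in range(self.rows):
            self.table[j, self.h[j, i]] += self.s[j, i] * delta

    def query(self, i):
        """Recover coordinate i as the median of the row estimates."""
        return float(np.median(
            [self.s[j, i] * self.table[j, self.h[j, i]]
             for j in range(self.rows)]))
```

Because updates are additive, accumulated (second-order) gradient components can be streamed into the sketch coordinate by coordinate, and the heaviest coordinates, i.e., the selected features, recovered by querying; the median across rows keeps the estimate robust to hash collisions.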
Pages: 75-92
Page count: 18