Revisiting the Nyström Method for Improved Large-scale Machine Learning

Cited by: 0

Authors:

Gittens, Alex [1]

Mahoney, Michael W.

Affiliations:

[1] Univ Calif Berkeley, Int Comp Sci Inst, Berkeley, CA 94720 USA

Source:

JOURNAL OF MACHINE LEARNING RESEARCH | 2016 / Vol. 17

Keywords:

Nyström approximation; low-rank approximation; kernel methods; randomized algorithms; numerical linear algebra; RANDOMIZED ALGORITHM; MATRIX; IDENTIFICATION; APPROXIMATION

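DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812;

Abstract: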

We reconsider randomized algorithms for the low-rank approximation of symmetric positive semi-definite (SPSD) matrices such as Laplacian and kernel matrices that arise in data analysis and machine learning applications. Our main results consist of an empirical evaluation of the performance quality and running time of sampling and projection methods on a diverse suite of SPSD matrices. Our results highlight complementary aspects of sampling versus projection methods; they characterize the effects of common data preprocessing steps on the performance of these algorithms; and they point to important differences between uniform sampling and nonuniform sampling methods based on leverage scores. In addition, our empirical results illustrate that existing theory is so weak that it does not provide even a qualitative guide to practice. Thus, we complement our empirical results with a suite of worst-case theoretical bounds for both random sampling and random projection methods. These bounds are qualitatively superior to existing bounds (e.g., improved additive-error bounds for spectral and Frobenius norm error and relative-error bounds for trace norm error), and they point to future directions to make these algorithms useful in even larger-scale machine learning applications.
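To make the comparison in the abstract concrete, the following is a minimal sketch of a column-sampling Nyström approximation of an SPSD matrix, with columns chosen either uniformly or with probabilities proportional to rank-k leverage scores. It is an illustration under stated assumptions, not the authors' implementation: the function names, the eigenvalue cutoff `tol`, the RBF test matrix, and the choice to sample without replacement and without rescaling are all hypothetical simplifications. The leverage scores are computed exactly from a full eigendecomposition, which is only practical for small matrices.

```python
import numpy as np


def psd_pinv(W, tol=1e-10):
    """Pseudo-inverse of a symmetric PSD matrix via its eigendecomposition.

    `tol` is an assumed relative cutoff for numerically zero eigenvalues.
    """
    vals, vecs = np.linalg.eigh(W)
    keep = vals > tol * vals.max()
    return (vecs[:, keep] / vals[keep]) @ vecs[:, keep].T


def nystrom(A, cols):
    """Nystrom approximation C W^+ C^T built from the selected columns of A."""
    C = A[:, cols]                 # n x l block of sampled columns
    W = A[np.ix_(cols, cols)]      # l x l intersection block
    return C @ psd_pinv(W) @ C.T


def leverage_scores(A, k):
    """Rank-k leverage scores: squared row norms of the top-k eigenvectors."""
    _, vecs = np.linalg.eigh(A)    # eigenvalues are returned in ascending order
    U = vecs[:, -k:]
    return np.sum(U**2, axis=1)    # nonnegative entries that sum to k


def sample_columns(n, l, rng, scores=None):
    """Pick l distinct column indices, uniformly or proportionally to scores.

    Assumption: sampling without replacement, a simplification for this sketch.
    """
    probs = None if scores is None else scores / scores.sum()
    return rng.choice(n, size=l, replace=False, p=probs)


def relative_trace_error(A, A_hat):
    """Trace (nuclear) norm of the residual, relative to that of A."""
    return np.linalg.norm(A - A_hat, "nuc") / np.linalg.norm(A, "nuc")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 3))
    sq = np.sum(X**2, axis=1)
    # Hypothetical test matrix: an RBF kernel on random points.
    A = np.exp(-0.5 * np.maximum(sq[:, None] + sq[None, :] - 2 * X @ X.T, 0.0))

    k, l = 10, 30
    uniform_cols = sample_columns(A.shape[0], l, rng)
    leverage_cols = sample_columns(A.shape[0], l, rng, leverage_scores(A, k))
    print("uniform :", relative_trace_error(A, nystrom(A, uniform_cols)))
    print("leverage:", relative_trace_error(A, nystrom(A, leverage_cols)))
```

Which sampling scheme gives the smaller error depends on the matrix; a single random draw such as this one does not settle the comparison.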


Pages: 65

