Revisiting the Nystrom Method for Improved Large-scale Machine Learning

Cited by: 0
Authors
Gittens, Alex [1]
Mahoney, Michael W.
Affiliations
[1] Univ Calif Berkeley, Int Comp Sci Inst, Berkeley, CA 94720 USA
Keywords
Nystrom approximation; low-rank approximation; kernel methods; randomized algorithms; numerical linear algebra; RANDOMIZED ALGORITHM; MATRIX; IDENTIFICATION; APPROXIMATION
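DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract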
We reconsider randomized algorithms for the low-rank approximation of symmetric positive semi-definite (SPSD) matrices such as Laplacian and kernel matrices that arise in data analysis and machine learning applications. Our main results consist of an empirical evaluation of the performance quality and running time of sampling and projection methods on a diverse suite of SPSD matrices. Our results highlight complementary aspects of sampling versus projection methods; they characterize the effects of common data preprocessing steps on the performance of these algorithms; and they point to important differences between uniform sampling and nonuniform sampling methods based on leverage scores. In addition, our empirical results illustrate that existing theory is so weak that it does not provide even a qualitative guide to practice. Thus, we complement our empirical results with a suite of worst-case theoretical bounds for both random sampling and random projection methods. These bounds are qualitatively superior to existing bounds (e.g., improved additive-error bounds for spectral and Frobenius norm error and relative-error bounds for trace norm error), and they point to future directions to make these algorithms useful in even larger-scale machine learning applications.
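To make the distinction between uniform and leverage-score column sampling concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes an RBF kernel on synthetic Gaussian data, samples columns without replacement and without rescaling (a simplification), and computes exact rank-k leverage scores from a full eigendecomposition, which would be too costly at large scale. All function names and parameter values (`rbf_kernel`, `nystrom`, `leverage_scores`, `gamma`, `k`, `l`) are illustrative choices.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Dense RBF kernel matrix exp(-gamma * ||x_i - x_j||^2); SPSD by construction."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def nystrom(A, idx):
    """Nystrom approximation C W^+ C^T built from the columns of A indexed by idx."""
    C = A[:, idx]                 # sampled columns
    W = A[np.ix_(idx, idx)]       # intersection block
    return C @ np.linalg.pinv(W) @ C.T

def leverage_scores(A, k):
    """Rank-k leverage scores: squared row norms of the top-k eigenvectors of A."""
    _, V = np.linalg.eigh(A)      # eigenvalues in ascending order
    Vk = V[:, -k:]                # eigenvectors of the k largest eigenvalues
    return np.sum(Vk ** 2, axis=1)

rng = np.random.default_rng(0)
n, k, l = 200, 10, 40             # matrix size, target rank, number of sampled columns
X = rng.standard_normal((n, 5))   # synthetic data (assumption)
A = rbf_kernel(X, gamma=0.5)

# Uniform sampling of column indices.
uniform_idx = rng.choice(n, size=l, replace=False)

# Nonuniform sampling with probabilities proportional to leverage scores.
scores = leverage_scores(A, k)
p = scores / scores.sum()
leverage_idx = rng.choice(n, size=l, replace=False, p=p)

A_uniform = nystrom(A, uniform_idx)
A_leverage = nystrom(A, leverage_idx)

# Relative errors in the three norms named in the abstract.
for name, A_hat in [("uniform", A_uniform), ("leverage", A_leverage)]:
    E = A - A_hat
    errs = {
        "spectral": np.linalg.norm(E, 2) / np.linalg.norm(A, 2),
        "frobenius": np.linalg.norm(E, "fro") / np.linalg.norm(A, "fro"),
        "trace": np.linalg.norm(E, "nuc") / np.linalg.norm(A, "nuc"),
    }
    print(name, {key: float(val) for key, val in errs.items()})
```

Which scheme gives the smaller error depends on the matrix; this toy setup is only meant to show the mechanics and should not be read as reproducing the paper's empirical findings.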
Pages: 65
Related papers
50 in total
  • [1] A review of Nystrom methods for large-scale machine learning
    Sun, Shiliang
    Zhao, Jing
    Zhu, Jiang
    [J]. INFORMATION FUSION, 2015, 26 : 36 - 48
  • [2] The Variational Nystrom Method for Large-Scale Spectral Problems
    Vladymyrov, Max
    Carreira-Perpinan, Miguel A.
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [3] Double Nystrom Method: An Efficient and Accurate Nystrom Scheme for Large-Scale Data Sets
    Lim, Woosang
    Kim, Minhwan
    Park, Haesun
    Jung, Kyomin
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 37, 2015, 37 : 1367 - 1375
  • [4] Improved Powered Stochastic Optimization Algorithms for Large-Scale Machine Learning
    Yang, Zhuang
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
  • [5] A Survey on Large-Scale Machine Learning
    Wang, Meng
    Fu, Weijie
    He, Xiangnan
    Hao, Shijie
    Wu, Xindong
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (06) : 2574 - 2594
  • [6] Large-scale data classification method based on machine learning model
    Department of Electrical Engineering, Dalian Institute of Science and Technology, Dalian, China
    [J]. Int. J. Database Theory Appl., 2 (71-80):
  • [7] Clustered Nystrom Method for Large Scale Manifold Learning and Dimension Reduction
    Zhang, Kai
    Kwok, James T.
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS, 2010, 21 (10) : 1576 - 1587
  • [8] Efficient Machine Learning On Large-Scale Graphs
    Erickson, Parker
    Lee, Victor E.
    Shi, Feng
    Tang, Jiliang
    [J]. PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022 : 4788 - 4789
  • [9] Large-scale kernel extreme learning machine
    Deng, Wan-Yu
    Zheng, Qing-Hua
    Chen, Lin
    [J]. Jisuanji Xuebao/Chinese Journal of Computers, 2014, 37 (11) : 2235 - 2246
  • [10] Machine learning for large-scale MOF screening
    Coupry, Damien
    Groot, Laurens
    Addicoat, Matthew
    Heine, Thomas
    [J]. ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2017, 253