Optimal Tuning for Divide-and-conquer Kernel Ridge Regression with Massive Data

Cited by: 0
Authors
Xu, Ganggang [1 ]
Shang, Zuofeng [2 ]
Cheng, Guang [3 ]
Institutions
[1] SUNY Binghamton, Dept Math Sci, Binghamton, NY 13902 USA
[2] IUPUI, Dept Math Sci, Indianapolis, IN USA
[3] Purdue Univ, Dept Stat, W Lafayette, IN 47907 USA
Keywords
ASYMPTOTIC OPTIMALITY;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Divide-and-conquer is a powerful approach to analyzing large and massive data sets. In the nonparametric regression setting, although various theoretical frameworks have been established to achieve optimality in estimation or hypothesis testing, how to choose the tuning parameter in a practically effective way remains an open problem. In this paper, we propose a data-driven, divide-and-conquer procedure for selecting the tuning parameter in kernel ridge regression by modifying the popular Generalized Cross-Validation criterion (GCV; Wahba, 1990). The proposed criterion is computationally scalable for massive data sets, and under mild conditions it is shown to be asymptotically optimal in the sense that minimizing the proposed distributed-GCV (dGCV) criterion is equivalent to minimizing the true global conditional empirical loss of the averaged function estimator. This extends the existing optimality results for GCV to the divide-and-conquer framework.
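The workflow the abstract describes can be sketched as follows: the data are split into blocks, kernel ridge regression is fit on each block with a shared penalty, the block predictors are averaged, and a GCV-style criterion aggregated across blocks scores each candidate penalty. This is a minimal illustrative sketch, assuming a Gaussian kernel and an even random split; the score below is a generic GCV-style stand-in, not the paper's exact dGCV formula.

```python
import numpy as np

def gauss_kernel(X, Z, bw=1.0):
    # Gaussian (RBF) kernel matrix between row sets X and Z.
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * bw ** 2))

def dc_krr(X, y, lam, n_blocks=4, seed=0):
    """Divide-and-conquer KRR with a shared penalty lam.

    Returns an averaged predictor and an aggregated GCV-style score
    (pooled residual sum of squares over the squared average
    degrees-of-freedom deficit) for tuning lam.
    """
    n = len(y)
    rng = np.random.default_rng(seed)
    blocks = np.array_split(rng.permutation(n), n_blocks)
    fits, rss, trace = [], 0.0, 0.0
    for block in blocks:
        Xb, yb = X[block], y[block]
        m = len(block)
        K = gauss_kernel(Xb, Xb)
        # Hat matrix of the block-level KRR smoother.
        A = K @ np.linalg.inv(K + m * lam * np.eye(m))
        resid = yb - A @ yb
        rss += resid @ resid
        trace += np.trace(A)
        # Dual coefficients for the block predictor.
        alpha = np.linalg.solve(K + m * lam * np.eye(m), yb)
        fits.append((Xb, alpha))
    score = (rss / n) / (1.0 - trace / n) ** 2

    def predict(Xnew):
        # Average the block-level predictors (the "averaged estimator").
        return np.mean(
            [gauss_kernel(Xnew, Xb) @ a for Xb, a in fits], axis=0
        )

    return predict, score
```

In practice one would evaluate the score over a grid of penalty values and keep the minimizer; because each block is fit independently, the per-candidate cost scales with the block size rather than the full sample size.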
Pages: 9
Related Papers
50 items
  • [1] A Sharper Generalization Bound for Divide-and-Conquer Ridge Regression
    Wang, Shusen
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 5305 - 5312
  • [2] Divide and conquer kernel ridge regression: A distributed algorithm with minimax optimal rates
    Zhang, Yuchen
    Duchi, John
    Wainwright, Martin
    JOURNAL OF MACHINE LEARNING RESEARCH, 2015, 16 : 3299 - 3340
  • [4] Divide and conquer kernel quantile regression for massive dataset
    Bang, Sungwan
    Kim, Jaeoh
    KOREAN JOURNAL OF APPLIED STATISTICS, 2020, 33 (05) : 569 - 578
  • [5] Distributed Generalized Cross-Validation for Divide-and-Conquer Kernel Ridge Regression and Its Asymptotic Optimality
    Xu, Ganggang
    Shang, Zuofeng
    Cheng, Guang
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2019, 28 (04) : 891 - 908
  • [6] APPROXIMATIONS AND OPTIMAL GEOMETRIC DIVIDE-AND-CONQUER
    MATOUSEK, J
    JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1995, 50 (02) : 203 - 208
  • [7] A fast divide-and-conquer sparse Cox regression
    Wang, Yan
    Hong, Chuan
    Palmer, Nathan
    Di, Qian
    Schwartz, Joel
    Kohane, Isaac
    Cai, Tianxi
    BIOSTATISTICS, 2021, 22 (02) : 381 - 401
  • [8] A Divide-and-Conquer Solver for Kernel Support Vector Machines
    Hsieh, Cho-Jui
    Si, Si
    Dhillon, Inderjit S.
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 32 (CYCLE 1), 2014, 32
  • [9] Massive parallelization of divide-and-conquer algorithms over powerlists
    Achatz, K
    Schulte, W
    SCIENCE OF COMPUTER PROGRAMMING, 1996, 26 (1-3) : 59 - 78
  • [10] Divide-and-Conquer Learning with Nystrom: Optimal Rate and Algorithm
    Yin, Rong
    Liu, Yong
    Lu, Lijing
    Wang, Weiping
    Meng, Dan
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 6696 - 6703