Cross-validated bagged learning

Cited by: 10
Authors
Petersen, Maya L.
Molinaro, Annette M.
Sinisi, Sandra E.
van der Laan, Mark J.
Affiliations
[1] Univ Calif Berkeley, Sch Publ Hlth, Div Biostat, Berkeley, CA 94720 USA
[2] Yale Univ, Sch Publ Hlth, New Haven, CT 06520 USA
Funding
National Institutes of Health (USA)
Keywords
bootstrap aggregation; data-adaptive regression; resistant HIV; deletion/substitution/addition algorithm
DOI
10.1016/j.jmva.2007.07.004
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
Many applications aim to learn a high dimensional parameter of a data generating distribution based on a sample of independent and identically distributed observations. For example, the goal might be to estimate the conditional mean of an outcome given a list of input variables. In this prediction context, bootstrap aggregating (bagging) has been introduced as a method to reduce the variance of a given estimator at little cost to bias. Bagging involves applying an estimator to multiple bootstrap samples and averaging the result across bootstrap samples. In order to address the curse of dimensionality, a common practice has been to apply bagging to estimators which themselves use cross-validation, thereby using cross-validation within a bootstrap sample to select fine-tuning parameters trading off bias and variance of the bootstrap sample-specific candidate estimators. In this article we point out that in order to achieve the correct bias-variance trade-off for the parameter of interest, one should apply the cross-validation selector externally to candidate bagged estimators indexed by these fine-tuning parameters. We use three simulations to compare the new cross-validated bagging method with bagging of cross-validated estimators and bagging of non-cross-validated estimators. (c) 2007 Elsevier Inc. All rights reserved.
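The procedure described in the abstract, selecting the fine-tuning parameter by cross-validation applied externally to candidate bagged estimators, can be illustrated with a minimal sketch. This is not the authors' implementation: the base learner (polynomial regression), the use of polynomial degree as the fine-tuning parameter, the squared-error loss, the simulated data, and all function names (`fit_poly`, `bagged_fit`, `cv_select_bagged`) are assumptions made purely for illustration.

```python
import numpy as np

def fit_poly(x, y, degree):
    """Base estimator (assumed for illustration): least-squares polynomial fit."""
    coefs = np.polyfit(x, y, degree)
    return lambda x_new: np.polyval(coefs, x_new)

def bagged_fit(x, y, degree, n_boot, rng):
    """Bagging: fit the base estimator on n_boot bootstrap samples, average predictions."""
    n = len(x)
    fits = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # draw n indices with replacement
        fits.append(fit_poly(x[idx], y[idx], degree))
    return lambda x_new: np.mean([f(x_new) for f in fits], axis=0)

def cv_select_bagged(x, y, degrees, n_boot=25, n_folds=5, seed=0):
    """Apply V-fold cross-validation externally to each candidate bagged estimator."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), n_folds)
    cv_risk = {}
    for d in degrees:
        losses = []
        for val_idx in folds:
            train = np.ones(len(x), dtype=bool)
            train[val_idx] = False
            # The whole bagged estimator is refit on the training split only.
            predictor = bagged_fit(x[train], y[train], d, n_boot, rng)
            losses.append(np.mean((y[val_idx] - predictor(x[val_idx])) ** 2))
        cv_risk[d] = float(np.mean(losses))
    best = min(cv_risk, key=cv_risk.get)
    # Refit the selected bagged estimator on the full sample.
    return best, cv_risk, bagged_fit(x, y, best, n_boot, rng)

# Simulated data (hypothetical): noisy sine curve.
data_rng = np.random.default_rng(1)
x = data_rng.uniform(-1, 1, 200)
y = np.sin(3 * x) + data_rng.normal(0, 0.3, 200)

best_degree, cv_risk, predictor = cv_select_bagged(x, y, degrees=[1, 2, 3, 4, 5])
print(best_degree, cv_risk)
```

The contrast with the common practice mentioned in the abstract is where the selection step sits: here the validation folds score the averaged (bagged) predictor for each candidate degree, whereas bagging of cross-validated estimators would instead run the selection inside every bootstrap sample before averaging.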
Pages: 1693-1704
Number of pages: 12