Improving cross-validated bandwidth selection using subsampling-extrapolation techniques

Cited by: 6
Authors
Wang, Qing [1]
Lindsay, Bruce G. [2]
Affiliations
[1] Williams Coll, Dept Math & Stat, Williamstown, MA 01267 USA
[2] Penn State Univ, Dept Stat, University Pk, PA 16802 USA
Funding
National Science Foundation (USA)
Keywords
Bandwidth selection; Cross-validation; Extrapolation; L-2 distance; Nonparametric kernel density estimator; Subsampling; DENSITY-ESTIMATION; MODEL SELECTION
DOI
10.1016/j.csda.2015.03.005
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Cross-validation methodologies have been widely used as a means of selecting tuning parameters in nonparametric statistical problems. In this paper we focus on a new method for improving the reliability of cross-validation. We implement this method in the context of the kernel density estimator, where one needs to select the bandwidth parameter so as to minimize L-2 risk. The method is a two-stage subsampling-extrapolation bandwidth selection procedure: it first evaluates the risk at a fictional sample size m (m <= n, where n is the sample size) and then extrapolates the optimal bandwidth from m to n. This two-stage method can dramatically reduce the variability of the conventional unbiased cross-validation bandwidth selector. The simple first-order extrapolation estimator is equivalent to the rescaled "bagging-CV" bandwidth selector in Hall and Robinson (2009) if one sets the bootstrap size equal to the fictional sample size. However, our simplified expression for the risk estimator enables us to compute the aggregated risk without any bootstrapping. Furthermore, we develop a second-order extrapolation technique as an extension designed to improve the approximation of the true optimal bandwidth. To choose the fictional size m given a sample of size n, we propose a nested cross-validation methodology. Based on a simulation study, the proposed methods show promising performance across a wide selection of distributions. In addition, we investigate the asymptotic properties of the proposed bandwidth selectors. (C) 2015 The Authors. Published by Elsevier B.V.
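The two-stage idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: a Gaussian kernel, a plain grid search, a U-statistic form of the risk estimator at fictional size m (one plausible reading of the "simplified expression" that avoids bootstrapping), and the standard n^(-1/5) rate of the optimal bandwidth for the first-order extrapolation. The function names risk_at_m and select_bandwidth are hypothetical.

```python
import numpy as np

def risk_at_m(x, h, m):
    """Estimate the L2 risk (up to the constant integral of f^2) of a
    Gaussian-kernel density estimate at fictional sample size m,
    averaging over all pairs of the n observed points (assumed form)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # All pairwise differences x_i - x_j with i != j.
    diffs = (x[:, None] - x[None, :])[~np.eye(n, dtype=bool)]
    # Scaled kernel K_h(u): the N(0, h^2) density.
    k_h = np.exp(-0.5 * (diffs / h) ** 2) / (h * np.sqrt(2.0 * np.pi))
    # Convolution (K_h * K_h)(u): the N(0, 2 h^2) density.
    kk_h = np.exp(-0.25 * (diffs / h) ** 2) / (2.0 * h * np.sqrt(np.pi))
    roughness = 1.0 / (2.0 * np.sqrt(np.pi))  # integral of K^2 for the Gaussian kernel
    return roughness / (m * h) + (1.0 - 1.0 / m) * kk_h.mean() - 2.0 * k_h.mean()

def select_bandwidth(x, m, grid):
    """Stage 1: minimize the estimated risk at size m over a grid.
    Stage 2: first-order extrapolation from m to n, assuming h_opt ~ n^(-1/5)."""
    grid = np.asarray(grid, dtype=float)
    risks = np.array([risk_at_m(x, h, m) for h in grid])
    h_m = grid[np.argmin(risks)]
    return h_m * (m / len(x)) ** 0.2

# Illustrative usage on simulated data (not data from the paper).
rng = np.random.default_rng(0)
sample = rng.standard_normal(200)
grid = np.linspace(0.05, 1.5, 60)
h_hat = select_bandwidth(sample, m=50, grid=grid)
```

With m equal to n the criterion in this sketch reduces to the usual unbiased cross-validation score and the extrapolation factor is 1, so the sketch then returns the ordinary cross-validated bandwidth on the grid. The second-order extrapolation and the nested cross-validation choice of m mentioned in the abstract are not covered here.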
Pages: 51-71
Number of pages: 21