Improving cross-validated bandwidth selection using subsampling-extrapolation techniques

Cited by: 6
Authors
Wang, Qing [1]
Lindsay, Bruce G. [2]
Affiliations
[1] Williams Coll, Dept Math & Stat, Williamstown, MA 01267 USA
[2] Penn State Univ, Dept Stat, University Pk, PA 16802 USA
Funding
National Science Foundation (USA)
Keywords
Bandwidth selection; Cross-validation; Extrapolation; L2 distance; Nonparametric kernel density estimator; Subsampling; DENSITY-ESTIMATION; MODEL SELECTION
DOI
10.1016/j.csda.2015.03.005
Chinese Library Classification
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Cross-validation methodologies have been widely used as a means of selecting tuning parameters in nonparametric statistical problems. In this paper we focus on a new method for improving the reliability of cross-validation. We implement this method in the context of the kernel density estimator, where one needs to select the bandwidth parameter so as to minimize L2 risk. This method is a two-stage subsampling-extrapolation bandwidth selection procedure, which is realized by first evaluating the risk at a fictional sample size m (m <= sample size n) and then extrapolating the optimal bandwidth from m to n. This two-stage method can dramatically reduce the variability of the conventional unbiased cross-validation bandwidth selector. This simple first-order extrapolation estimator is equivalent to the rescaled "bagging-CV" bandwidth selector in Hall and Robinson (2009) if one sets the bootstrap size equal to the fictional sample size. However, our simplified expression for the risk estimator enables us to compute the aggregated risk without any bootstrapping. Furthermore, we developed a second-order extrapolation technique as an extension designed to improve the approximation of the true optimal bandwidth. To select the optimal choice of the fictional sample size m given a sample of size n, we propose a nested cross-validation methodology. Based on a simulation study, the proposed new methods show promising performance across a wide selection of distributions. We also investigated the asymptotic properties of the proposed bandwidth selectors. © 2015 The Authors. Published by Elsevier B.V.
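The two-stage idea in the abstract (estimate the risk at a fictional sample size m, then extrapolate the minimizer to n) can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: a Gaussian kernel, a U-statistic form of the L2 risk estimate that reduces to the usual unbiased cross-validation criterion when m = n, a plain grid search, and the standard first-order rate h proportional to n^(-1/5) for second-order kernels. The function names `risk_at_size` and `extrapolated_bandwidth` are hypothetical.

```python
import math
import random


def normal_pdf(u, sd):
    """Density of Normal(0, sd^2) evaluated at u."""
    return math.exp(-0.5 * (u / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def risk_at_size(x, h, m):
    """Estimated L2 risk (up to the constant integral of f^2) of a
    Gaussian-kernel density estimate with bandwidth h at fictional size m.

    Assumption: both expectations are estimated by averages over all pairs
    of the n observations, so no resampling is needed.
    """
    n = len(x)
    diffs = [x[i] - x[j] for i in range(n) for j in range(i + 1, n)]
    # K_h convolved with K_h is a normal density with sd sqrt(2) * h.
    conv = sum(normal_pdf(d, math.sqrt(2.0) * h) for d in diffs) / len(diffs)
    kern = sum(normal_pdf(d, h) for d in diffs) / len(diffs)
    roughness = 1.0 / (2.0 * math.sqrt(math.pi) * h)  # integral of K_h^2
    return roughness / m + (1.0 - 1.0 / m) * conv - 2.0 * kern


def extrapolated_bandwidth(x, m, grid):
    """Minimize the size-m risk over a grid, then rescale from m to n
    using the first-order rate h ~ n^(-1/5)."""
    n = len(x)
    h_m = min(grid, key=lambda h: risk_at_size(x, h, m))
    return h_m, h_m * (m / n) ** 0.2


random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(80)]
grid = [0.05 * 1.1 ** k for k in range(40)]  # candidate bandwidths (illustrative)
h_m, h_n = extrapolated_bandwidth(sample, m=40, grid=grid)
print(h_m, h_n)
```

Because m < n, the rescaled bandwidth is always smaller than the size-m minimizer. The choice of m is left to the caller here; the abstract proposes a nested cross-validation step for that, which this sketch does not attempt.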
Pages: 51-71
Page count: 21
Related papers
50 in total
  • [31] A Cross-Validated Feature Selection (CVFS) approach for extracting the most parsimonious feature sets and discovering potential antimicrobial resistance (AMR) biomarkers
    Yang, Ming-Ren
    Wu, Yu-Wei
    COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL, 2023, 21 : 769 - 779
  • [32] Assessing the Hazard of Deep-Seated Rock Slope Instability through the Description of Potential Failure Scenarios, Cross-Validated Using Several Remote Sensing and Monitoring Techniques
    Wolff, Charlotte
    Jaboyedoff, Michel
    Fei, Li
    Pedrazzini, Andrea
    Derron, Marc-Henri
    Rivolta, Carlo
    Merrien-Soukatchoff, Veronique
    REMOTE SENSING, 2023, 15 (22)
  • [33] THE DIMENSIONS OF HEALTH-STATUS IN CHRONIC DISEASE - A CROSS-VALIDATED ASSESSMENT OF OUTCOME USING RHEUMATOID-ARTHRITIS AS A MODEL
    KAZIS, LE
    BROWN, JH
    SPITZ, PW
    GERTMAN, P
    FRIES, JF
    MEENAN, RF
    CLINICAL RESEARCH, 1982, 30 (02) : A301 - A301
  • [34] Comment: Bayesian checking of the second level of hierarchical models: Cross-validated posterior predictive checks using discrepancy measures
    Larsen, Michael D.
    Lu, Lu
    STATISTICAL SCIENCE, 2007, 22 (03) : 359 - 362
  • [35] CROSS-VALIDATED R2 GUIDED REGION SELECTION IN COMPARATIVE MOLECULAR-FIELD ANALYSIS (COMFA) - A SIMPLE METHOD TO ACHIEVE CONSISTENCY IN THE RESULTS
    CHO, SJ
    TROPSHA, A
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 1994, 208 : 76 - COMP
  • [36] Improving clustering performance by using feature selection and extraction techniques
    Shihab, Khalil
    Journal of Intelligent Systems, 2004, 13 (03) : 249 - 273
  • [37] Improving the convergence of the iterative solution of matrix equations in the method of moments formulation using extrapolation techniques
    Ma, J
    Mittra, R
    Huang, N
    IEE PROCEEDINGS-MICROWAVES ANTENNAS AND PROPAGATION, 2003, 150 (04) : 253 - 257
  • [38] Optic Nerve Tolerance Dose Prediction by using cross-validated machine Learning: Quantitative Analysis of Data after Brachytherapy of Choroidal Melanoma
    Guberina, M.
    Sokolenko, E.
    Sauerwein, W.
    Bornfeld, N.
    Guberina, N.
    Rating, P.
    Fluehs, D.
    Bechrakis, N.
    Stuschke, M.
    STRAHLENTHERAPIE UND ONKOLOGIE, 2022, 198 (SUPPL 1) : S21 - S21
  • [39] On improving the performance of spam filters using heuristic feature selection techniques
    Wang, Ren
    Youssef, Amr M.
    Elhakeem, Ahmed K.
    2006 23RD BIENNIAL SYMPOSIUM ON COMMUNICATIONS, 2006, : 227 - +
  • [40] Reconstructing Paleo-oxygenation for the Last 54,000 Years in the Gulf of Alaska Using Cross-validated Benthic Foraminiferal and Geochemical Records
    Sharon
    Belanger, Christina
    Du, Jianghui
    Mix, Alan
    PALEOCEANOGRAPHY AND PALEOCLIMATOLOGY, 2021, 36 (02)