Improving cross-validated bandwidth selection using subsampling-extrapolation techniques

Cited by: 6
Authors
Wang, Qing [1]
Lindsay, Bruce G. [2]
Affiliations
[1] Williams Coll, Dept Math & Stat, Williamstown, MA 01267 USA
[2] Penn State Univ, Dept Stat, University Pk, PA 16802 USA
Funding
National Science Foundation (USA)
Keywords
Bandwidth selection; Cross-validation; Extrapolation; L2 distance; Nonparametric kernel density estimator; Subsampling; Density estimation; Model selection
DOI
10.1016/j.csda.2015.03.005
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Cross-validation methodologies have been widely used as a means of selecting tuning parameters in nonparametric statistical problems. In this paper we focus on a new method for improving the reliability of cross-validation. We implement this method in the context of the kernel density estimator, where one needs to select the bandwidth parameter so as to minimize the L2 risk. The method is a two-stage subsampling-extrapolation bandwidth selection procedure: it first evaluates the risk at a fictional sample size m (m <= sample size n) and then extrapolates the optimal bandwidth from m to n. This two-stage method can dramatically reduce the variability of the conventional unbiased cross-validation bandwidth selector. The simple first-order extrapolation estimator is equivalent to the rescaled "bagging-CV" bandwidth selector in Hall and Robinson (2009) if one sets the bootstrap size equal to the fictional sample size. However, our simplified expression for the risk estimator enables us to compute the aggregated risk without any bootstrapping. Furthermore, we develop a second-order extrapolation technique as an extension designed to improve the approximation of the true optimal bandwidth. To select the fictional size m given a sample of size n, we propose a nested cross-validation methodology. In a simulation study, the proposed methods show promising performance across a wide selection of distributions. We also investigate the asymptotic properties of the proposed bandwidth selectors. (C) 2015 The Authors. Published by Elsevier B.V.
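The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: a Gaussian kernel; an aggregated risk approximated by averaging the unbiased cross-validation (UCV) criterion over random subsamples of size m drawn without replacement, whereas the paper reports a closed-form expression that avoids resampling; and a first-order extrapolation factor (m/n)^(1/5), which follows from the standard n^(-1/5) rate of the optimal bandwidth for second-order kernels. The function names `ucv` and `subsample_extrapolate_bandwidth` are hypothetical.

```python
import numpy as np


def normal_pdf(x, sd):
    """Density of N(0, sd^2) evaluated at x."""
    return np.exp(-0.5 * (x / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def ucv(x, h):
    """Unbiased cross-validation criterion for a Gaussian-kernel density estimate."""
    n = x.size
    d = x[:, None] - x[None, :]  # all pairwise differences
    # Integral of the squared estimate: Gaussian kernels convolve to sd = h * sqrt(2).
    int_f2 = normal_pdf(d, h * np.sqrt(2.0)).sum() / n**2
    # Average leave-one-out density at the data points (diagonal excluded).
    k = normal_pdf(d, h)
    loo = (k.sum() - np.trace(k)) / (n * (n - 1))
    return int_f2 - 2.0 * loo


def subsample_extrapolate_bandwidth(x, m, grid, n_subsamples=50, seed=0):
    """Stage 1: minimise the averaged UCV at size m. Stage 2: rescale to size n."""
    rng = np.random.default_rng(seed)
    n = x.size
    risk = np.zeros(len(grid))
    for _ in range(n_subsamples):
        sub = rng.choice(x, size=m, replace=False)  # subsample without replacement
        risk += np.array([ucv(sub, h) for h in grid])
    risk /= n_subsamples
    h_m = grid[np.argmin(risk)]
    # Assumed first-order extrapolation: optimal bandwidth scales like n^(-1/5).
    h_n = h_m * (m / n) ** 0.2
    return h_n, h_m
```

Calling `subsample_extrapolate_bandwidth(x, m=100, grid=np.linspace(0.05, 1.5, 30))` on a one-dimensional sample `x` of size 200 returns the extrapolated bandwidth together with the size-m minimiser. Averaging the criterion over many subsamples before minimising is what is meant to reduce variability relative to a single UCV minimisation on the full sample.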
Pages: 51-71
Number of pages: 21