Does data splitting improve prediction?

Cited by: 17
Author
Faraway, Julian J. [1 ]
Affiliation
[1] Univ Bath, Dept Math Sci, Bath BA2 7AY, Avon, England
Keywords
Cross-validation; Model assessment; Model uncertainty; Model validation; Prediction; Scoring; MODEL SELECTION; VALIDATION; ERROR;
DOI
10.1007/s11222-014-9522-9
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
Data splitting divides data into two parts. One part is reserved for model selection. In some applications, the second part is used for model validation but we use this part for estimating the parameters of the chosen model. We focus on the problem of constructing reliable predictive distributions for future observed values. We judge the predictive performance using log scoring. We compare the full data strategy with the data splitting strategy for prediction. We show how the full data score can be decomposed into model selection, parameter estimation and data reuse costs. Data splitting is preferred when data reuse costs are high. We investigate the relative performance of the strategies in four simulation scenarios. We introduce a hybrid estimator that uses one part for model selection but both parts for estimation. We argue that a split data analysis is preferred to a full data analysis for prediction with some exceptions.
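The three strategies named in the abstract (full data, split data, hybrid) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the nested linear regression candidates, AIC as the selection rule, the plug-in normal predictive distribution, the equal split, the simulated data, and the sign convention of the log score (mean negative log predictive density, so lower is better). The function names are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit; returns coefficients and the ML estimate of the error variance."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid) / len(y)

def aic(X, y):
    """Gaussian AIC up to an additive constant (assumed selection criterion)."""
    _, sigma2 = fit_ols(X, y)
    n, k = X.shape
    return n * np.log(sigma2) + 2 * (k + 1)

def select_model(X, y):
    """Choose how many leading columns to keep (nested candidates) by smallest AIC."""
    return min(range(1, X.shape[1] + 1), key=lambda k: aic(X[:, :k], y))

def mean_log_score(X_fit, y_fit, X_new, y_new, k):
    """Mean negative log density of y_new under a plug-in normal predictive distribution."""
    beta, sigma2 = fit_ols(X_fit[:, :k], y_fit)
    mu = X_new[:, :k] @ beta
    nll = 0.5 * np.log(2 * np.pi * sigma2) + (y_new - mu) ** 2 / (2 * sigma2)
    return float(np.mean(nll))

def compare_strategies(n=100, n_new=1000, seed=0):
    """Simulate one data set and score the full, split, and hybrid strategies."""
    rng = np.random.default_rng(seed)
    beta_true = np.array([1.0, 0.5, 0.0, 0.0, 0.0, 0.0])  # assumed true model

    def simulate(m):
        X = np.column_stack([np.ones(m), rng.normal(size=(m, 5))])
        return X, X @ beta_true + rng.normal(size=m)

    X, y = simulate(n)
    X_new, y_new = simulate(n_new)          # future observations to be predicted
    half = n // 2
    X1, y1, X2, y2 = X[:half], y[:half], X[half:], y[half:]

    k_full = select_model(X, y)             # full data: select and estimate on everything
    k_split = select_model(X1, y1)          # split: select on part 1 only
    return {
        "full": mean_log_score(X, y, X_new, y_new, k_full),
        "split": mean_log_score(X2, y2, X_new, y_new, k_split),   # estimate on part 2
        "hybrid": mean_log_score(X, y, X_new, y_new, k_split),    # estimate on both parts
    }

if __name__ == "__main__":
    for name, score in compare_strategies().items():
        print(f"{name}: {score:.4f}")
```

A single simulated data set says little about which strategy is better. Averaging the scores over many seeds and varying the candidate set would be closer in spirit to the simulation scenarios the abstract mentions, though the paper's actual scenarios are not described in this record.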
Pages: 49-60
Page count: 12