Recursive partitioning for missing data imputation in the presence of interaction effects

Cited by: 147
Authors
Doove, L. L. [1, 2]
Van Buuren, S. [1, 3]
Dusseldorp, E. [2, 3]
Affiliations
[1] Univ Utrecht, Fac Social Sci, Dept Methodol & Stat, NL-3508 TC Utrecht, Netherlands
[2] Katholieke Univ Leuven, Dept Psychol, Louvain, Belgium
[3] TNO, Netherlands Org Appl Sci Res, NL-2301 CE Leiden, Netherlands
Keywords
CART; Classification and regression trees; Interaction problem; MICE; Nonlinear relations; Random forests; Multiple imputation; Regression trees
DOI
10.1016/j.csda.2013.10.025
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Standard approaches to implement multiple imputation do not automatically incorporate nonlinear relations like interaction effects. This leads to biased parameter estimates when interactions are present in a dataset. With the aim of providing an imputation method which preserves interactions in the data automatically, the use of recursive partitioning as imputation method is examined. Three recursive partitioning techniques are implemented in the multiple imputation by chained equations framework. It is investigated, using simulated data, whether recursive partitioning creates appropriate variability between imputations and unbiased parameter estimates with appropriate confidence intervals. It is concluded that, when interaction effects are present in a dataset, substantial gains are possible by using recursive partitioning for imputation compared to standard applications. In addition, it is shown that the potential of recursive partitioning imputation approaches depends on the relevance of a possible interaction effect, the correlation structure of the data, and the type of possible interaction effect present in the data. (C) 2013 Elsevier B.V. All rights reserved.
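The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers a single incomplete variable rather than the full chained-equations cycle, and the donor-drawing rule (sample an observed value from the leaf a case falls into) is an assumption made here for illustration. All names (`grow_tree`, `find_donors`, `impute_once`) and the toy data are hypothetical.

```python
import random
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def grow_tree(rows, targets, min_leaf=5):
    """Recursively partition the observed cases; each leaf keeps its targets as donors."""
    best = None  # (score, feature index, threshold)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows})[:-1]:  # candidate split points
            left = [y for r, y in zip(rows, targets) if r[j] <= t]
            right = [y for r, y in zip(rows, targets) if r[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None or best[0] >= sse(targets):
        return {"donors": list(targets)}  # no useful split: make a leaf
    _, j, t = best
    left_idx = [i for i, r in enumerate(rows) if r[j] <= t]
    right_idx = [i for i, r in enumerate(rows) if r[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": grow_tree([rows[i] for i in left_idx],
                          [targets[i] for i in left_idx], min_leaf),
        "right": grow_tree([rows[i] for i in right_idx],
                           [targets[i] for i in right_idx], min_leaf),
    }


def find_donors(tree, row):
    """Walk from the root to the leaf that `row` falls into and return its donors."""
    while "donors" not in tree:
        go_left = row[tree["feature"]] <= tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["donors"]


def impute_once(X, y, rng):
    """Fit a tree on the observed cases, then fill each gap with a random leaf donor."""
    obs = [i for i, v in enumerate(y) if v is not None]
    tree = grow_tree([X[i] for i in obs], [y[i] for i in obs])
    return [v if v is not None else rng.choice(find_donors(tree, X[i]))
            for i, v in enumerate(y)]


# Toy data with a pure interaction: y is about 10 only when both predictors are 1.
rng = random.Random(0)
X = [(i % 2, (i // 2) % 2) for i in range(200)]
y_full = [10.0 * a * b + rng.gauss(0, 1) for a, b in X]
y = [None if i % 3 == 0 else v for i, v in enumerate(y_full)]  # every third value missing

# Several completed datasets; the random donor draws differ between them.
imputations = [impute_once(X, y, rng) for _ in range(5)]
```

Because the tree splits on both predictors without being told about their product, the imputed values for cases with both predictors equal to 1 come from the high-valued leaf, which is the sense in which the interaction is preserved without being specified. Drawing a donor at random, rather than imputing the leaf mean, is what creates variability between the completed datasets.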
Pages: 92-104
Number of pages: 13