Imputation with the R Package VIM

Cited by: 355
Authors
Kowarik, Alexander [1]
Templ, Matthias [1, 2]
Affiliations
[1] Stat Austria, Methods Unit, A-1110 Vienna, Austria
[2] Vienna Univ Technol, Vienna, Austria
Source
JOURNAL OF STATISTICAL SOFTWARE | 2016, Vol. 74, Issue 07
Keywords
missing values; imputation methods; R; multiple imputation; missing data
DOI
10.18637/jss.v074.i07
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
The package VIM (Templ, Alfons, Kowarik, and Prantner 2016) is developed to explore and analyze the structure of missing values in data using visualization methods, to impute these missing values with the built-in imputation methods, to verify the imputation process using visualization tools, and to produce high-quality graphics for publications. This article focuses on the imputation techniques available in the package. Four imputation methods are currently implemented in VIM: hot-deck imputation, k-nearest neighbor imputation, regression imputation, and iterative robust model-based imputation (Templ, Kowarik, and Filzmoser 2011). All of these methods are implemented flexibly, with many options for customization. Practical examples illustrate the use of the implemented methods in real-world applications. In addition, the graphical user interface of VIM has been re-implemented from scratch, resulting in the package VIMGUI (Schopfhauser, Templ, Alfons, Kowarik, and Prantner 2016), which gives users without extensive R skills access to these imputation and visualization methods.
Pages: 16
Related papers
50 in total
  • [1] yaImpute: An R package for kNN imputation
    Crookston, Nicholas L.
    Finley, Andrew O.
    [J]. JOURNAL OF STATISTICAL SOFTWARE, 2008, 23 (10)
  • [2] imputeqc: an R package for assessing imputation quality of genotypes and optimizing imputation parameters
    Khvorykh, Gennady V.
    Khrunin, Andrey V.
    [J]. BMC BIOINFORMATICS, 21
  • [3] imputeqc: an R package for assessing imputation quality of genotypes and optimizing imputation parameters
    Khvorykh, Gennady V.
    Khrunin, Andrey V.
    [J]. BMC BIOINFORMATICS, 2020, 21 (Suppl 12)
  • [4] FHDI: An R Package for Fractional Hot Deck Imputation
    Im, Jongho
    Cho, In Ho
    Kim, Jae Kwang
    [J]. R JOURNAL, 2018, 10 (01): 140-154
  • [5] Integration and imputation of survey data in R: the StatMatch package
    D'Orazio, Marcello
    [J]. ROMANIAN STATISTICAL REVIEW, 2015, (02): 57-68
  • [6] imputeqc: an R package for assession and optimization of genotype imputation parameters
    Khvorykh, Gennady V.
    Khrunin, Andrey V.
    [J]. BMC BIOINFORMATICS, 2019, 20
  • [7] Multiple Imputation of Multilevel Missing Data: An Introduction to the R Package pan
    Grund, Simon
    Luedtke, Oliver
    Robitzsch, Alexander
    [J]. SAGE OPEN, 2016, 6 (04)
  • [8] R Package imputeTestbench to Compare Imputation Methods for Univariate Time Series
    Beck, Marcus W.
    Bokde, Neeraj
    Asencio-Cortes, Gualberto
    Kulat, Kishore
    [J]. R JOURNAL, 2018, 10 (01): 218-233
  • [9] The R Package hmi: A Convenient Tool for Hierarchical Multiple Imputation and Beyond
    Speidel, Matthias
    Drechsler, Joerg
    Jolani, Shahab
    [J]. JOURNAL OF STATISTICAL SOFTWARE, 2020, 95 (09): 1-48
  • [10] imputomics: web server and R package for missing values imputation in metabolomics data
    Chilimoniuk, Jaroslaw
    Grzesiak, Krystyna
    Kala, Jakub
    Nowakowski, Dominik
    Kretowski, Adam
    Kolenda, Rafal
    Ciborowski, Michal
    Burdukiewicz, Michal
    [J]. BIOINFORMATICS, 2024, 40 (03)