gcimpute: A Package for Missing Data Imputation

Authors
Zhao, Yuxuan [1]
Udell, Madeleine [2]
Affiliations
[1] Cornell Univ, Dept Stat & Data Sci, Ithaca, NY 14850 USA
[2] Stanford Univ, Management Sci & Engn, Stanford, CA 94305 USA
Source
JOURNAL OF STATISTICAL SOFTWARE | 2024 / Volume 108 / Issue 04
Keywords
missing data; single imputation; multiple imputation; Gaussian copula; mixed data; imputation uncertainty; Python
DOI
10.18637/jss.v108.i04
Chinese Library Classification number
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
This article introduces the Python package gcimpute for missing data imputation. Package gcimpute can impute missing data with many different variable types, including continuous, binary, ordinal, count, and truncated values, by modeling data as samples from a Gaussian copula model. This semiparametric model learns the marginal distribution of each variable to match the empirical distribution, yet describes the interactions between variables with a joint Gaussian that enables fast inference, imputation with confidence intervals, and multiple imputation. The package also provides specialized extensions to handle large datasets (with complexity linear in the number of observations) and streaming datasets (with online imputation). This article describes the underlying methodology and demonstrates how to use the software package.
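The Gaussian copula idea in the abstract can be illustrated with a small standard-library sketch. It is not the gcimpute API and makes no claim to match the package's estimation procedure: it handles only two continuous variables, with `x` fully observed and `y` partly missing, and the names `normal_scores`, `empirical_quantile`, and `impute_y` are hypothetical. Each margin is mapped to latent normal scores through its empirical distribution, a single latent correlation is estimated from complete pairs, and each missing `y` is filled with the empirical quantile that corresponds to the conditional mean of its latent value.

```python
import math
import random
from bisect import bisect_right
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the latent scale


def normal_scores(values):
    """Map observed entries to latent normal scores via the scaled empirical CDF.

    Missing entries (None) stay None.
    """
    obs = sorted(v for v in values if v is not None)
    n = len(obs)
    scores = []
    for v in values:
        if v is None:
            scores.append(None)
        else:
            # rank / (n + 1) keeps the probability strictly inside (0, 1)
            scores.append(_STD_NORMAL.inv_cdf(bisect_right(obs, v) / (n + 1)))
    return scores


def pearson(pairs):
    """Plain Pearson correlation of a list of (a, b) pairs."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    return cov / math.sqrt(var_a * var_b)


def empirical_quantile(obs_sorted, p):
    """Return the observed value at probability level p (inverse empirical CDF)."""
    idx = math.ceil(p * len(obs_sorted)) - 1
    return obs_sorted[min(len(obs_sorted) - 1, max(0, idx))]


def impute_y(x, y):
    """Fill missing entries of y (None) from x under a bivariate Gaussian copula.

    Assumption of this sketch: x has no missing entries.
    Returns (imputed y, estimated latent correlation).
    """
    zx = normal_scores(x)
    zy = normal_scores(y)
    rho = pearson([(a, b) for a, b in zip(zx, zy) if b is not None])
    y_obs = sorted(v for v in y if v is not None)
    filled = []
    for z_x, y_i in zip(zx, y):
        if y_i is not None:
            filled.append(y_i)  # observed entries are kept unchanged
        else:
            z_hat = rho * z_x  # conditional mean of the latent y given latent x
            filled.append(empirical_quantile(y_obs, _STD_NORMAL.cdf(z_hat)))
    return filled, rho


# Toy data: a skewed (log-normal) y that depends on x, with every fifth y missing.
random.seed(0)
x_demo = [random.gauss(0.0, 1.0) for _ in range(200)]
y_full = [math.exp(v + 0.3 * random.gauss(0.0, 1.0)) for v in x_demo]
y_demo = [None if i % 5 == 0 else v for i, v in enumerate(y_full)]
y_imputed, rho_hat = impute_y(x_demo, y_demo)
print("estimated latent correlation:", round(rho_hat, 3))
```

Because imputed values are read off the empirical quantiles of the observed `y`, they always stay within the observed range and respect the skewed margin, which is the practical appeal of learning the marginals nonparametrically while keeping the dependence Gaussian.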
Pages: 1-27
Page count: 27
Related papers
50 in total
  • [1] Multiple Imputation of Multilevel Missing Data: An Introduction to the R Package pan
    Grund, Simon
    Luedtke, Oliver
    Robitzsch, Alexander
    SAGE OPEN, 2016, 6 (04)
  • [2] IMPUTATION OF MISSING DATA
    Lunt, M.
    ANNALS OF THE RHEUMATIC DISEASES, 2014, 73 : 49 - 49
  • [3] imputomics: web server and R package for missing values imputation in metabolomics data
    Chilimoniuk, Jaroslaw
    Grzesiak, Krystyna
    Kala, Jakub
    Nowakowski, Dominik
    Kretowski, Adam
    Kolenda, Rafal
    Ciborowski, Michal
    Burdukiewicz, Michal
    BIOINFORMATICS, 2024, 40 (03)
  • [4] Missing data imputation: focusing on single imputation
    Zhang, Zhongheng
    ANNALS OF TRANSLATIONAL MEDICINE, 2016, 4 (01)
  • [5] Missing Data: data replacement and imputation
    Hutcheson, Graeme
    Pampaka, Maria
    JOURNAL OF MODELLING IN MANAGEMENT, 2012, 7 (02)
  • [6] Missing Data Imputation: A Survey
    Kelkar, Bhagyashri Abhay
    INTERNATIONAL JOURNAL OF DECISION SUPPORT SYSTEM TECHNOLOGY, 2022, 14 (01)
  • [7] Missing Data and Imputation Methods
    Schober, Patrick
    Vetter, Thomas R.
    ANESTHESIA AND ANALGESIA, 2020, 131 (05) : 1419 - 1420
  • [8] Missing Data and Multiple Imputation
    Cummings, Peter
    JAMA PEDIATRICS, 2013, 167 (07) : 656 - 661
  • [10] Missing data, imputation, and endogeneity
    McDonough, Ian K.
    Millimet, Daniel L.
    JOURNAL OF ECONOMETRICS, 2017, 199 (02) : 141 - 155