Regression-based imputation of explanatory discrete missing data

Cited by: 1
Authors
Hernandez-Herrera, Gilma [1, 2]
Navarro, Albert [1]
Morina, David [3]
Affiliations
[1] Univ Autonoma Barcelona, Res Grp Psychosocial Risks, Unitat Bioestadist, Fac Med,Org Work & Hlth POWAH, Barcelona, Spain
[2] Univ Antioquia, Fac Med, Inst Invest Med, Medellin, Colombia
[3] Univ Barcelona, Dept Econometr Stat & Appl Econ, Riskctr IREA, Barcelona, Spain
Keywords
COMPoisson; Count data; Hermite; Missing data; Multiple imputation; Zero-inflated; ZERO-INFLATED POISSON; GENERALIZED HERMITE; SEMIPARAMETRIC ESTIMATION; MULTIPLE IMPUTATION; MODEL
DOI
10.1080/03610918.2022.2149805
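Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract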
Imputation of missing values is a strategy for handling non-response in surveys or data loss in measurement processes, and it may be more effective than ignoring the losses and omitting the incomplete observations. The characteristics of the variables with missing values must be considered when choosing the imputation method; in particular, when the variable is a count, the literature on this issue is scarce. If the variable has an excess of zeros, models that include parameters for handling zero-inflation need to be considered. Likewise, if over- or under-dispersion is observed, generalizations of the Poisson distribution, such as the Hermite or Conway-Maxwell Poisson distributions, are recommended for carrying out imputation. The aim of this study was to assess the performance of various regression models based on Poisson generalizations in the imputation of a discrete variable, in comparison with classical count models, through a comprehensive simulation study covering a variety of scenarios and a real data example. To do so, we compared the results of estimations using only complete data with those using imputations based on the most common count models. The COMPoisson distribution provides, in general, better results in any dispersion scenario, especially when the amount of missing information is large.
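As an illustration of the general idea of regression-based imputation of a count variable, the following is a minimal sketch and not the authors' procedure. It assumes a plain Poisson regression (the classical count model, not the Hermite or COMPoisson generalizations studied in the paper), a single fully observed predictor `z`, and simulated data; all function and variable names (`fit_poisson`, `draw_poisson`, `impute_counts`, `x_missing`) are hypothetical. It performs one stochastic imputation and ignores parameter uncertainty, which a proper multiple-imputation scheme would also propagate.

```python
import math
import random


def fit_poisson(z, x, iters=50):
    """Fit log E[x] = b0 + b1 * z by Newton-Raphson on the Poisson log-likelihood."""
    b0, b1 = math.log(max(sum(x) / len(x), 1e-8)), 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * zi) for zi in z]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(xi - mi for xi, mi in zip(x, mu))
        g1 = sum((xi - mi) * zi for xi, mi, zi in zip(x, mu, z))
        # Fisher information matrix (2 x 2), inverted in closed form.
        h00 = sum(mu)
        h01 = sum(mi * zi for mi, zi in zip(mu, z))
        h11 = sum(mi * zi * zi for mi, zi in zip(mu, z))
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0, b1


def draw_poisson(mu, rng):
    """Draw one Poisson(mu) variate with Knuth's multiplication method."""
    limit, k, p = math.exp(-mu), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def impute_counts(z, x, rng):
    """Replace None entries of x with draws from the fitted Poisson regression."""
    observed = [(zi, xi) for zi, xi in zip(z, x) if xi is not None]
    b0, b1 = fit_poisson([o[0] for o in observed], [o[1] for o in observed])
    return [
        xi if xi is not None else draw_poisson(math.exp(b0 + b1 * zi), rng)
        for zi, xi in zip(z, x)
    ]


# Simulated example (assumed data, not from the paper): about 30% of the counts are missing.
rng = random.Random(0)
z = [rng.random() for _ in range(200)]
x_full = [draw_poisson(math.exp(0.5 + 1.0 * zi), rng) for zi in z]
x_missing = [None if rng.random() < 0.3 else xi for xi in x_full]
completed = impute_counts(z, x_missing, rng)
```

Repeating the last step several times, each with fresh draws, and pooling the resulting estimates would move this sketch toward multiple imputation; handling zero-inflation or over- and under-dispersion would require swapping the Poisson model for one of the generalizations named in the abstract.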
Pages: 4363-4379
Page count: 17
Related papers
50 in total
  • [21] A Novel Missing Data Imputation Approach for Time Series Air Quality Data Based on Logistic Regression
    Chen, Mei
    Zhu, Hongyu
    Chen, Yongxu
    Wang, Youshuai
    [J]. ATMOSPHERE, 2022, 13 (07)
  • [22] IMPUTATION OF MISSING DATA
    Lunt, M.
    [J]. ANNALS OF THE RHEUMATIC DISEASES, 2014, 73 : 49 - 49
  • [23] A regression-based approach for measuring similarity in discrete signals
    Hassanpour, Hamid
    Darvishi, Ali
    Khalili, Atena
    [J]. INTERNATIONAL JOURNAL OF ELECTRONICS, 2011, 98 (09) : 1141 - 1156
  • [24] Regression-based detection of missing boundaries in multiphase polycrystalline microstructures
    Prabakar, Manoj
    Amos, Prince Gideon Kubendran
    [J]. PHILOSOPHICAL MAGAZINE LETTERS, 2023, 103 (01)
  • [25] Development of Imputation Methods for Missing Data in Multiple Linear Regression Analysis
    Thidarat Thongsri
    Klairung Samart
    [J]. Lobachevskii Journal of Mathematics, 2022, 43 : 3390 - 3399
  • [26] Composite Imputation Method for the Multiple Linear Regression with Missing at Random Data
    Thongsri, Thidarat
    Samart, Klairung
    [J]. INTERNATIONAL JOURNAL OF MATHEMATICS AND COMPUTER SCIENCE, 2022, 17 (01): : 51 - 62
  • [27] Methods for the analysis of explanatory linear regression models with missing data not at random
    Pastor, JBN
    [J]. QUALITY & QUANTITY, 2003, 37 (04) : 363 - 376
  • [28] Methods for the Analysis of Explanatory Linear Regression Models with Missing Data Not at Random
    José Blas Navarro Pastor
    [J]. Quality and Quantity, 2003, 37 (4) : 363 - 376
  • [29] Using multiple imputation to estimate missing data in meta-regression
    Ellington, E. Hance
    Bastille-Rousseau, Guillaume
    Austin, Cayla
    Landolt, Kristen N.
    Pond, Bruce A.
    Rees, Erin E.
    Robar, Nicholas
    Murray, Dennis L.
    [J]. METHODS IN ECOLOGY AND EVOLUTION, 2015, 6 (02): : 153 - 163
  • [30] Development of Imputation Methods for Missing Data in Multiple Linear Regression Analysis
    Thongsri, Thidarat
    Samart, Klairung
    [J]. LOBACHEVSKII JOURNAL OF MATHEMATICS, 2022, 43 (11) : 3390 - 3399