Regression-Based Approach to Test Missing Data Mechanisms

Citations: 6
Authors
Rouzinov, Serguei [1]
Berchtold, Andre [2, 3]
Affiliations
[1] Statistique Vaud, CH-1003 Lausanne, Switzerland
[2] Univ Lausanne, Inst Social Sci, CH-1015 Lausanne, Switzerland
[3] Univ Lausanne, NCCR LIVES, CH-1015 Lausanne, Switzerland
Funding
Swiss National Science Foundation
Keywords
distribution; Dixon test; generating mechanisms; Jamshidian and Jalal test; Little test; missing data; regression; multivariate normality; longitudinal data; distributions
DOI
10.3390/data7020016
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Missing data occur in almost all surveys; in order to handle them correctly it is essential to know their type. Missing data are generally divided into three types (or generating mechanisms): missing completely at random, missing at random, and missing not at random. The first step to understand the type of missing data generally consists in testing whether the missing data are missing completely at random or not. Several tests have been developed for that purpose, but they have difficulties when dealing with non-continuous variables and data with a low quantity of missing data. Our approach checks whether the missing data are missing completely at random or missing at random using a regression model and a distribution test, and it can be applied to continuous and categorical data. The simulation results show that our regression-based approach tends to be more sensitive to the quantity and the type of missing data than the commonly used methods.
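The abstract does not spell out the authors' procedure, so the following is only a generic sketch of the underlying idea and not the method of the paper. It assumes one fully observed covariate `x` and a missingness indicator `r` (1 if the value of another variable is missing), fits a logistic regression of `r` on `x`, and compares it with an intercept-only model by a likelihood-ratio test. A small p-value suggests that missingness depends on observed data, which is evidence against missing completely at random. The function names `missingness_lr_test` and `simulate`, and all simulation settings, are hypothetical.

```python
import math
import random


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _loglik(x, r, b0, b1):
    # Bernoulli log-likelihood of r under P(r=1) = sigmoid(b0 + b1*x).
    total = 0.0
    for xi, ri in zip(x, r):
        z = b0 + b1 * xi
        total += ri * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return total


def missingness_lr_test(x, r, n_iter=50, tol=1e-10):
    """Regress the missingness indicator r on x; return (slope, LR stat, p-value).

    Assumes r contains both 0s and 1s and that x does not perfectly separate them.
    """
    b0 = b1 = 0.0
    for _ in range(n_iter):  # Newton-Raphson on the two coefficients
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ri in zip(x, r):
            p = _sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += ri - p
            g1 += (ri - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            break
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    ll_full = _loglik(x, r, b0, b1)
    p_bar = sum(r) / len(r)
    ll_null = _loglik(x, r, math.log(p_bar / (1.0 - p_bar)), 0.0)
    stat = max(2.0 * (ll_full - ll_null), 0.0)
    # Survival function of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return b1, stat, p_value


def simulate(n, slope, seed):
    # slope = 0 gives MCAR; slope != 0 makes missingness depend on x (MAR).
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    r = [1 if rng.random() < _sigmoid(-1.0 + slope * xi) else 0 for xi in x]
    return x, r


if __name__ == "__main__":
    for label, slope in (("MCAR", 0.0), ("MAR", 1.5)):
        x, r = simulate(2000, slope, seed=1)
        b1, stat, p = missingness_lr_test(x, r)
        print(f"{label}: slope={b1:.3f}, LR={stat:.2f}, p={p:.3g}")
```

With several observed covariates or categorical predictors, a statistics library's logistic regression would be the more practical choice; the hand-written fit is used here only to keep the sketch free of dependencies.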
Pages: 28