Missing Data in Experiments: Challenges and Solutions

Cited by: 18
Authors
Gomila, Robin [1]
Clark, Chelsey S. [1]
Affiliations
[1] Princeton Univ, Dept Psychol, Peretsman Scully Hall, Princeton, NJ 08544 USA
Keywords
missing data; attrition; experiment; inverse probability weighting; double sampling and bounds; NONRESPONSE; OUTCOMES; WEIGHTS; BOUNDS
DOI
10.1037/met0000361
Chinese Library Classification (CLC) number
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
Translational Abstract: Researchers rarely manage to collect every piece of information about each participant in their study. For instance, participants sometimes refuse to answer questions that they consider sensitive (e.g., income, political orientation, sexual practices) or quit the study before completing it. If ignored or handled inappropriately, this phenomenon, referred to as "missingness," generally compromises researchers' ability to make causal inferences based on their experiments. Specifically, missingness biases researchers' estimates of the effect size of the treatment. In this tutorial, we review the different ways in which missingness impacts the results of experimental studies and provide researchers with concrete steps for addressing each type of missingness they may encounter. For mild cases of missingness, we recommend using a method called inverse probability weighting (IPW). For severe instances of missingness, we recommend that researchers recontact a sample of participants with missing values to fill the gaps. This method, which involves recollecting data, is called double sampling and bounds. For both methods, we provide lines of R code that researchers may use in their own analyses.

Missing data is a common feature of experimental datasets. Standard methods used by psychology researchers to handle missingness rely on unrealistic assumptions, invalidate random assignment procedures, and bias estimates of effect sizes. We describe different classes of missing data typically encountered in experimental datasets, and we discuss how each of them impacts researchers' causal inferences. In this tutorial, we provide concrete guidelines for handling each class of missingness, focusing on 2 methods that make realistic assumptions: (a) inverse probability weighting (IPW) for mild instances of missingness, and (b) double sampling and bounds for severe instances of missingness. After reviewing the reasons why these methods increase the accuracy of researchers' estimates of effect sizes, we provide lines of R code that researchers may use in their own analyses.
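
The abstract names inverse probability weighting without showing its mechanics, so here is a minimal sketch of the general idea. It is not the authors' R code. It assumes a single discrete covariate `x`, a binary treatment `z`, and that missingness is ignorable within each treatment-by-covariate cell. The function names `ipw_effect` and `naive_effect` and the toy data are hypothetical, introduced only for illustration.

```python
from collections import defaultdict


def naive_effect(rows):
    """Difference in means using only participants with observed outcomes."""
    obs = {0: [], 1: []}
    for r in rows:
        if r["y"] is not None:
            obs[r["z"]].append(r["y"])
    return sum(obs[1]) / len(obs[1]) - sum(obs[0]) / len(obs[0])


def ipw_effect(rows):
    """IPW difference in means.

    Each row is a dict with 'z' (0/1 treatment), 'x' (discrete covariate),
    and 'y' (outcome, or None when missing). Assumption: within each
    (z, x) cell, responders resemble nonresponders.
    """
    # Step 1: count observed and total participants in each (z, x) cell.
    counts = defaultdict(lambda: [0, 0])  # cell -> [n_observed, n_total]
    for r in rows:
        cell = (r["z"], r["x"])
        counts[cell][1] += 1
        if r["y"] is not None:
            counts[cell][0] += 1

    # Step 2: weight each observed outcome by 1 / P(observed | z, x).
    sums = {0: [0.0, 0.0], 1: [0.0, 0.0]}  # arm -> [sum of w*y, sum of w]
    for r in rows:
        if r["y"] is None:
            continue
        n_obs, n_tot = counts[(r["z"], r["x"])]
        w = n_tot / n_obs  # inverse of the cell's response rate
        sums[r["z"]][0] += w * r["y"]
        sums[r["z"]][1] += w

    # Step 3: difference of the weighted arm means.
    return sums[1][0] / sums[1][1] - sums[0][0] / sums[0][1]


# Toy data (hypothetical): half of the treated x=1 participants drop out.
rows = (
    [{"z": 1, "x": 0, "y": 1.0}] * 4
    + [{"z": 1, "x": 1, "y": 3.0}] * 2
    + [{"z": 1, "x": 1, "y": None}] * 2
    + [{"z": 0, "x": 0, "y": 0.0}] * 4
    + [{"z": 0, "x": 1, "y": 2.0}] * 4
)

print("naive:", naive_effect(rows))
print("ipw:  ", ipw_effect(rows))
```

In this toy dataset the full-data effect is 1.0. The naive estimate falls to about 0.67 because high-outcome treated participants are underrepresented among responders, while the weighted estimate recovers 1.0. With continuous or multiple covariates, the cell response rates would typically be replaced by a fitted response model.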
Pages: 143-155
Page count: 13