Analyzing incomplete political science data: An alternative algorithm for multiple imputation

Cited by: 1024
Authors
King, G [1]
Honaker, J
Joseph, A
Scheve, K
Affiliations
[1] Harvard Univ, Ctr Basic Res Social Sci, World Hlth Org, Global Programme Evidence Hlth Policy, Cambridge, MA 02138 USA
[2] Harvard Univ, Ctr Basic Res Social Sci, Dept Govt, Cambridge, MA 02138 USA
[3] Yale Univ, Inst Social & Policy Studies, Dept Polit Sci, New Haven, CT 06520 USA
DOI
10.1017/S0003055401000235
Chinese Library Classification number
D0 [Political science, political theory]
Discipline classification code
0302; 030201
Abstract
We propose a remedy for the discrepancy between the way political scientists analyze data with missing values and the recommendations of the statistics community. Methodologists and statisticians agree that "multiple imputation" is a superior approach to the problem of missing data scattered through one's explanatory and dependent variables than the methods currently used in applied data analysis. The discrepancy occurs because the computational algorithms used to apply the best multiple imputation models have been slow, difficult to implement, impossible to run with existing commercial statistical packages, and have demanded considerable expertise. We adapt an algorithm and use it to implement a general-purpose, multiple imputation model for missing data. This algorithm is considerably faster and easier to use than the leading method recommended in the statistics literature. We also quantify the risks of current missing data practices, illustrate how to use the new procedure, and evaluate this alternative through simulated data as well as actual empirical examples. Finally, we offer easy-to-use software that implements all methods discussed.
Pages: 49-69
Page count: 21
Related papers
50 in total
  • [1] Multiple imputation for incomplete data with semicontinuous variables
    Javaras, KN
    Van Dyk, DA
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2003, 98 (463) : 703 - 715
  • [2] A multiple imputation strategy for incomplete longitudinal data
    Landrum, MB
    Becker, MP
    [J]. STATISTICS IN MEDICINE, 2001, 20 (17-18) : 2741 - 2760
  • [3] Multiple Imputation for Incomplete Data in Epidemiologic Studies
    Harel, Ofer
    Mitchell, Emily M.
    Perkins, Neil J.
    Cole, Stephen R.
    Tchetgen Tchetgen, Eric J.
    Sun, BaoLuo
    Schisterman, Enrique F.
    [J]. AMERICAN JOURNAL OF EPIDEMIOLOGY, 2018, 187 (03) : 576 - 584
  • [4] An Improved Mean Imputation Clustering Algorithm for Incomplete Data
    Shi, Hong
    Wang, Pingxin
    Yang, Xin
    Yu, Hualong
    [J]. NEURAL PROCESSING LETTERS, 2022, 54 (05) : 3537 - 3550
  • [5] An Improved Mean Imputation Clustering Algorithm for Incomplete Data
    Hong Shi
    Pingxin Wang
    Xin Yang
    Hualong Yu
    [J]. Neural Processing Letters, 2022, 54 : 3537 - 3550
  • [6] Multiple Imputation and Genetic Programming for Classification with Incomplete Data
    Cao Truong Tran
    Zhang, Mengjie
    Andreae, Peter
    Xue, Bing
    [J]. PROCEEDINGS OF THE 2017 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE (GECCO'17), 2017 : 521 - 528
  • [7] Multiple Imputation and Ensemble Learning for Classification with Incomplete Data
    Cao Truong Tran
    Zhang, Mengjie
    Andreae, Peter
    Xue, Bing
    Lam Thu Bui
    [J]. INTELLIGENT AND EVOLUTIONARY SYSTEMS, IES 2016, 2017, 8 : 401 - 415
  • [8] Multiple Imputation for Incomplete Data in Environmental Epidemiology Research
    Prince Addo Allotey
    Ofer Harel
    [J]. Current Environmental Health Reports, 2019, 6 : 62 - 71
  • [9] Multiple Imputation for Incomplete Data in Environmental Epidemiology Research
    Allotey, Prince Addo
    Harel, Ofer
    [J]. CURRENT ENVIRONMENTAL HEALTH REPORTS, 2019, 6 (02) : 62 - 71
  • [10] A functional multiple imputation approach to incomplete longitudinal data
    He, Yulei
    Yucel, Recai
    Raghunathan, Trivellore E.
    [J]. STATISTICS IN MEDICINE, 2011, 30 (10) : 1137 - 1156