The effect of sample size and missingness on inference with missing data

Author
Morimoto, Julian [1]
Affiliation
[1] Harvard Univ, Cambridge, MA 02138 USA
Keywords
Incomplete data; sample size and missing data mechanism; partial likelihood; asymptotic inference with missing data; multiple imputation; likelihood
DOI
10.1080/03610926.2022.2152287
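Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract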
When are inferences (whether Direct-Likelihood, Bayesian, or Frequentist) obtained from partial data valid? This article answers this question by offering a new asymptotic theory about inference with missing data that is more general than existing theories. It proves that as the sample size increases and the extent of missingness decreases, the average-loglikelihood function generated by partial data and that ignores the missingness mechanism will converge in probability to that which would have been generated by complete data; and if the data are Missing at Random, this convergence depends only on sample size. Thus, inferences from partial data, such as posterior modes, confidence intervals, likelihood ratios, test statistics, and indeed, all quantities or features derived from the partial-data loglikelihood function, will be consistently estimated. Additionally, the missing data mechanism has asymptotically no effect on parameter estimation and hypothesis testing if the data are Missing at Random. This adds to previous research which has only proved the consistency and asymptotic normality of the posterior mode. Practical implications are discussed, and the theory is illustrated through simulation using a previous study of International Human Rights Law.
Pages: 3292 - 3311
Number of pages: 20
Related papers
50 in total
  • [21] Estimating Equations Inference With Missing Data
    Zhou, Yong
    Wan, Alan T. K.
    Wang, Xiaojing
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2008, 103 (483) : 1187 - 1199
  • [22] Sample size and the fallacies of classical inference
    Friston, Karl J.
    NEUROIMAGE, 2013, 81 : 503 - 504
  • [23] Statistical inference for missing data mechanisms
    Zhao, Yang
    STATISTICS IN MEDICINE, 2020, 39 (28) : 4325 - 4333
  • [24] Inference for Heteroskedastic PCA with Missing Data
    Yan, Yuling
    Chen, Yuxin
    Fan, Jianqing
    ANNALS OF STATISTICS, 2024, 52 (02) : 729 - 756
  • [25] Causal Inference: A Missing Data Perspective
    Ding, Peng
    Li, Fan
    STATISTICAL SCIENCE, 2018, 33 (02) : 214 - 237
  • [26] Sample Size for a Phylogenetic Inference
    Churchill, GA
    von Haeseler, A
    Navidi, WC
    MOLECULAR BIOLOGY AND EVOLUTION, 1992, 9 (04) : 753 - 769
  • [27] An effect of data size on performance of effort estimation with missing data techniques
    Tamura, Koichi
    Monden, Akito
    Matsumoto, Ken-Ichi
    Computer Software, 2010, 27 (02) : 100 - 105
  • [28] A suggestion for best practice for missing data in diary collection: exploring the missingness first
    Skaltsa, Konstantina
    Kral, Pavol
    Reaney, Matthew
    O'Kelly, Michael
    QUALITY OF LIFE RESEARCH, 2020, 29 (SUPPL 1) : S76 - S76
  • [29] What is Missing in Missing Data Handling? An Evaluation of Missingness in and Potential Remedies for Doctoral Dissertations and Subsequent Publications that Use NHANES Data
    Yu, Hairui
    Perumean-Chaney, Suzanne E.
    Kaiser, Kathryn A.
    JOURNAL OF STATISTICS AND DATA SCIENCE EDUCATION, 2024, 32 (01) : 3 - 10
  • [30] Inference in randomized trials with death and missingness
    Wang, Chenguang
    Scharfstein, Daniel O.
    Colantuoni, Elizabeth
    Girard, Timothy D.
    Yan, Ying
    BIOMETRICS, 2017, 73 (02) : 431 - 440