The effect of sample size and missingness on inference with missing data

Cited by: 0
Author
Morimoto, Julian [1]
Affiliation
[1] Harvard Univ, Cambridge, MA 02138 USA
Keywords
Incomplete data; sample size and missing data mechanism; partial likelihood; asymptotic inference with missing data; multiple imputation; likelihood
DOI
10.1080/03610926.2022.2152287
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
When are inferences (whether Direct-Likelihood, Bayesian, or Frequentist) obtained from partial data valid? This article answers this question by offering a new asymptotic theory about inference with missing data that is more general than existing theories. It proves that as the sample size increases and the extent of missingness decreases, the average-loglikelihood function generated by partial data and that ignores the missingness mechanism will converge in probability to that which would have been generated by complete data; and if the data are Missing at Random, this convergence depends only on sample size. Thus, inferences from partial data, such as posterior modes, confidence intervals, likelihood ratios, test statistics, and indeed, all quantities or features derived from the partial-data loglikelihood function, will be consistently estimated. Additionally, the missing data mechanism has asymptotically no effect on parameter estimation and hypothesis testing if the data are Missing at Random. This adds to previous research which has only proved the consistency and asymptotic normality of the posterior mode. Practical implications are discussed, and the theory is illustrated through simulation using a previous study of International Human Rights Law.
Pages: 3292-3311
Page count: 20