Fitting logistic regression models with contaminated case-control data

Cited by: 1
Authors
Cheng, K. F. [1]
Chen, L. C.
Affiliations
[1] Natl Cent Univ, Grad Inst Stat, Chungli, Taiwan
[2] Tamkang Univ, Dept Stat, Taipei, Taiwan
Keywords
case-control data; contamination; logistic regression; maximum likelihood; misclassification;
DOI
10.1016/j.jspi.2005.07.009
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract
Measurement errors frequently occur when observing responses. If case-control data are based on reported responses, which may differ from the true responses, the result is contaminated case-control data. In this paper, we first show that ordinary logistic regression analysis based on contaminated case-control data can lead to seriously biased conclusions. This is supported by a theoretical argument, one example, and two simulation studies. We next derive the semiparametric maximum likelihood estimate (MLE) of the risk parameter of a logistic regression model when a validation subsample is available. The asymptotic normality of the semiparametric MLE is shown, along with a consistent estimate of its asymptotic variance. Our example and the two simulation studies show that these estimates perform reasonably well in finite samples. (c) 2005 Elsevier B.V. All rights reserved.
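The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method and does not implement their semiparametric MLE. It only shows, under assumed settings, how an ordinary logistic fit behaves when case-control sampling is based on reported rather than true responses. All names and values (`simulate`, `fit_logistic`, a true slope of 1.0, an intercept of -2.0, and a nondifferential misclassification rate of 0.2) are hypothetical choices made for illustration.

```python
import math
import random


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, iters=25):
    """Newton-Raphson fit of logit P(y=1|x) = a + b*x; returns (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = expit(a + b * x)
            w = p * (1.0 - p)
            g0 += y - p            # score for the intercept
            g1 += (y - p) * x      # score for the slope
            h00 += w               # entries of the information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def simulate(beta=1.0, alpha=-2.0, flip=0.2, n_pop=40000, n_each=2000, seed=0):
    """Compare slopes fitted to true-response and reported-response samples.

    Assumption: each response is flipped independently with probability
    `flip`, regardless of the covariate (nondifferential misclassification).
    """
    rng = random.Random(seed)
    pop = []
    for _ in range(n_pop):
        x = rng.gauss(0.0, 1.0)
        y = 1 if rng.random() < expit(alpha + beta * x) else 0
        y_rep = 1 - y if rng.random() < flip else y
        pop.append((x, y, y_rep))

    def case_control(idx):
        # Take n_each cases and n_each controls according to column idx
        # (1 = true response, 2 = reported response).
        cases = [r for r in pop if r[idx] == 1][:n_each]
        controls = [r for r in pop if r[idx] == 0][:n_each]
        sample = cases + controls
        return [r[0] for r in sample], [r[idx] for r in sample]

    _, b_true = fit_logistic(*case_control(1))
    _, b_rep = fit_logistic(*case_control(2))
    return b_true, b_rep


b_true, b_rep = simulate()
print("slope from true responses:    ", round(b_true, 3))
print("slope from reported responses:", round(b_rep, 3))
```

Under these assumed settings, the slope fitted to the reported responses is pulled toward zero relative to the slope fitted to the true responses. Correcting for this kind of bias, for example with a validation subsample as the abstract describes, is beyond this sketch.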
Pages: 4147 - 4160
Page count: 14
Related papers
50 in total
  • [21] A chi-squared goodness-of-fit test for logistic regression models based on case-control data
    Zhang, B
    [J]. BIOMETRIKA, 1999, 86 (03) : 531 - 539
  • [22] Bayesian multiple logistic regression for case-control GWAS
    Banerjee, Saikat
    Zeng, Lingyao
    Schunkert, Heribert
    Soeding, Johannes
    [J]. PLOS GENETICS, 2018, 14 (12):
  • [23] Bias-corrected maximum semiparametric likelihood estimation under logistic regression models based on case-control data
    Zhang, B
    [J]. JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2006, 136 (01) : 108 - 124
  • [25] Assessing the fit of the logistic regression model to individual matched sets of case-control data
    Bedrick, EJ
    Hill, JR
    [J]. BIOMETRICS, 1996, 52 (01) : 1 - 9
  • [26] USING OF STRATIFICATION AND THE LOGISTIC-REGRESSION MODEL IN THE ANALYSIS OF DATA OF CASE-CONTROL STUDIES
    GIMENO, SGA
    DESOUZA, JMP
    [J]. REVISTA DE SAUDE PUBLICA, 1995, 29 (04) : 283 - 289
  • [27] Unconditional or Conditional Logistic Regression Model for Age-Matched Case-Control Data?
    Kuo, Chia-Ling
    Duan, Yinghui
    Grady, James
    [J]. FRONTIERS IN PUBLIC HEALTH, 2018, 6
  • [28] Fitting semiparametric accelerated failure time models for nested case-control data
    Kang, Sangwook
    [J]. JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2017, 87 (04) : 652 - 663
  • [29] A study on the effects of unbalanced data when fitting logistic regression models in ecology
    Salas-Eljatib, Christian
    Fuentes-Ramirez, Andres
    Gregoire, Timothy G.
    Altamirano, Adison
    Yaitul, Valeska
    [J]. ECOLOGICAL INDICATORS, 2018, 85 : 502 - 508
  • [30] LOGISTIC DISEASE INCIDENCE MODELS AND CASE-CONTROL STUDIES
    PRENTICE, RL
    PYKE, R
    [J]. BIOMETRIKA, 1979, 66 (03) : 403 - 411