Detecting potential outliers in longitudinal data with time-dependent covariates

Cited by: 3
Authors
Mramba, Lazarus K. [1]
Liu, Xiang [1]
Lynch, Kristian F. [1]
Yang, Jimin [1]
Aronsson, Carin Andrén [2, 3]
Hummel, Sandra [4, 5, 6]
Norris, Jill M. [7]
Virtanen, Suvi M. [8, 9, 10, 11, 12]
Hakola, Leena [11, 12]
Uusitalo, Ulla M. [1]
Krischer, Jeffrey P. [1]
Affiliations
[1] Univ S Florida, Hlth Informat Inst, Morsani Coll Med, Tampa, FL USA
[2] Lund Univ, Dept Clin Sci, Malmo, Sweden
[3] Skane Univ Hosp, Dept Pediat, Malmo, Sweden
[4] Helmholtz Zent, Inst Diabet Res, Munich, Germany
[5] Tech Univ, Klinikum Rechts Isar, Forschergrp Diabet, Munich, Germany
[6] Forschergrp Diabet e V, Munich, Germany
[7] Univ Colorado Anschutz Med Campus, Colorado Sch Publ Hlth, Dept Epidemiol, Aurora, CO USA
[8] Finnish Inst Hlth & Welf, Hlth & Well Being Promot Unit, Helsinki, Finland
[9] Univ Tampere, Ctr Child Hlth Res, Tampere, Finland
[10] Tampere Univ Hosp, Tampere, Finland
[11] Tampere Univ, Fac Social Sci, Unit Hlth Sci, Tampere, Finland
[12] Tampere Univ Hosp, Wellbeing Serv Cty Pirkanmaa, Tampere, Finland
Keywords
IMPLAUSIBLE VALUES
DOI
10.1038/s41430-023-01393-6
Chinese Library Classification number
R15 [Nutrition hygiene, food hygiene]; TS201 [Basic science]
Discipline classification code
100403
Abstract
Background: Outliers can influence regression model parameters and change the direction of the estimated effect, over-estimating or under-estimating the strength of the association between a response variable and an exposure of interest. Identifying visit-level outliers in longitudinal data with continuous time-dependent covariates is important when the distribution of such a variable is highly skewed.

Objectives: The primary objective was to identify potential outliers at follow-up visits using the interquartile range (IQR) statistic and to assess their influence on estimated Cox regression parameters.

Methods: The study was motivated by a large TEDDY dataset of longitudinal dietary and time-to-event data, with continuous time-varying vitamin B12 intake as the exposure of interest and development of islet autoimmunity (IA) as the response variable. An IQR algorithm was applied to the TEDDY dataset to detect potential outliers at each visit. To assess the impact of the detected outliers, the data were analyzed using the extended time-dependent Cox model with a robust sandwich estimator. Partial residual diagnostic plots were examined for highly influential outliers.

Results: Extreme vitamin B12 observations from IA cases had a stronger influence on the Cox regression model than those from non-cases. The identified outliers changed the direction of the hazard ratios, the standard errors, or the strength of the association with the risk of developing IA.

Conclusion: At the exploratory data analysis stage, the IQR algorithm can be used as a data quality control tool to identify potential outliers at the visit level, which can then be investigated further.
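The abstract does not spell out the IQR algorithm, so the following is only a minimal sketch of one common way to flag visit-level outliers: Tukey-style fences computed separately for each visit. The function name `flag_visit_outliers`, the fence multiplier `k=1.5`, the "inclusive" quartile method, and the sample values are all assumptions made for illustration; they are not taken from the paper.

```python
from collections import defaultdict
from statistics import quantiles


def flag_visit_outliers(records, k=1.5):
    """Return the set of (subject_id, visit) pairs outside the per-visit IQR fences.

    records: iterable of (subject_id, visit, value) tuples.
    k: fence multiplier; 1.5 is Tukey's convention and is an assumption here.
    """
    # Group observations by visit so each visit gets its own fences.
    by_visit = defaultdict(list)
    for subject_id, visit, value in records:
        by_visit[visit].append((subject_id, value))

    flagged = set()
    for visit, rows in by_visit.items():
        values = [v for _, v in rows]
        if len(values) < 4:
            continue  # too few observations to estimate quartiles sensibly
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        iqr = q3 - q1
        lower, upper = q1 - k * iqr, q3 + k * iqr
        for subject_id, value in rows:
            if value < lower or value > upper:
                flagged.add((subject_id, visit))
    return flagged


# Made-up intake values (arbitrary units) for five subjects at two visits.
records = [
    ("s1", 1, 2.0), ("s2", 1, 2.5), ("s3", 1, 3.0), ("s4", 1, 3.5), ("s5", 1, 40.0),
    ("s1", 2, 2.2), ("s2", 2, 2.4), ("s3", 2, 2.6), ("s4", 2, 2.8), ("s5", 2, 3.0),
]
print(flag_visit_outliers(records))
```

Computing the fences per visit rather than over the pooled data keeps a value that is typical at one age from being flagged merely because intake differs across visits. Flagged observations are candidates for review, in line with the abstract's framing of the method as a quality control step.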
Pages: 344-350
Number of pages: 7
Related papers
50 in total
  • [1] Detecting potential outliers in longitudinal data with time-dependent covariates
    Lazarus K. Mramba
    Xiang Liu
    Kristian F. Lynch
    Jimin Yang
    Carin Andrén Aronsson
    Sandra Hummel
    Jill M. Norris
    Suvi M. Virtanen
    Leena Hakola
    Ulla M. Uusitalo
    Jeffrey P. Krischer
    European Journal of Clinical Nutrition, 2024, 78 : 344 - 350
  • [2] The analysis of binary longitudinal data with time-dependent covariates
    Guerra, Matthew W.
    Shults, Justine
    Amsterdam, Jay
    Ten-Have, Thomas
    STATISTICS IN MEDICINE, 2012, 31 (10) : 931 - 948
  • [3] Improved methods for the marginal analysis of longitudinal data in the presence of time-dependent covariates
    Chen, I-Chen
    Westgate, Philip M.
    STATISTICS IN MEDICINE, 2017, 36 (16) : 2533 - 2546
  • [4] Marginal quantile regression for longitudinal data analysis in the presence of time-dependent covariates
    Chen, I-Chen
    Westgate, Philip M.
    INTERNATIONAL JOURNAL OF BIOSTATISTICS, 2021, 17 (02) : 267 - 282
  • [5] EMPIRICAL LIKELIHOOD APPROACH FOR LONGITUDINAL DATA WITH MISSING VALUES AND TIME-DEPENDENT COVARIATES
    Yan Zhang
    Weiping Zhang
    Xiao Guo
    Annals of Applied Mathematics, 2016, 32 (02) : 200 - 220
  • [6] Regression Analysis of Longitudinal Data with Time-Dependent Covariates and Informative Observation Times
    Song, Xinyuan
    Mu, Xiaoyun
    Sun, Liuquan
    SCANDINAVIAN JOURNAL OF STATISTICS, 2012, 39 (02) : 248 - 258
  • [7] Analysis of time-dependent covariates in failure time data
    Aydemir, L
    Aydemir, S
    Dirschedl, P
    STATISTICS IN MEDICINE, 1999, 18 (16) : 2123 - 2134
  • [8] GMM logistic regression models for longitudinal data with time-dependent covariates and extended classifications
    Lalonde, Trent L.
    Wilson, Jeffrey R.
    Yin, Jianqiong
    STATISTICS IN MEDICINE, 2014, 33 (27) : 4756 - 4769
  • [9] Using modified approaches on marginal regression analysis of longitudinal data with time-dependent covariates
    Zhou, Yi
    Lefante, John
    Rice, Janet
    Chen, Shande
    STATISTICS IN MEDICINE, 2014, 33 (19) : 3354 - 3364
  • [10] Regression analysis of longitudinal binary data with time-dependent environmental covariates: bias and efficiency
    Schildcrout, JS
    Heagerty, PJ
    BIOSTATISTICS, 2005, 6 (04) : 633 - 652