Detecting potential outliers in longitudinal data with time-dependent covariates

Cited by: 3
Authors
Mramba, Lazarus K. [1]
Liu, Xiang [1]
Lynch, Kristian F. [1]
Yang, Jimin [1]
Aronsson, Carin Andren [2, 3]
Hummel, Sandra [4, 5, 6]
Norris, Jill M. [7]
Virtanen, Suvi M. [8, 9, 10, 11, 12]
Hakola, Leena [11, 12]
Uusitalo, Ulla M. [1]
Krischer, Jeffrey P. [1]
Affiliations
[1] Univ S Florida, Hlth Informat Inst, Morsani Coll Med, Tampa, FL USA
[2] Lund Univ, Dept Clin Sci, Malmo, Sweden
[3] Skane Univ Hosp, Dept Pediat, Malmo, Sweden
[4] Helmholtz Zent, Inst Diabet Res, Munich, Germany
[5] Tech Univ, Klinikum Rechts Isar, Forschergrp Diabet, Munich, Germany
[6] Forschergrp Diabet e V, Munich, Germany
[7] Univ Colorado Anschutz Med Campus, Colorado Sch Publ Hlth, Dept Epidemiol, Aurora, CO USA
[8] Finnish Inst Hlth & Welf, Hlth & Well Being Promot Unit, Helsinki, Finland
[9] Univ Tampere, Ctr Child Hlth Res, Tampere, Finland
[10] Tampere Univ Hosp, Tampere, Finland
[11] Tampere Univ, Fac Social Sci, Unit Hlth Sci, Tampere, Finland
[12] Tampere Univ Hosp, Wellbeing Serv Cty Pirkanmaa, Tampere, Finland
Keywords
IMPLAUSIBLE VALUES;
DOI
10.1038/s41430-023-01393-6
Chinese Library Classification (CLC) number
R15 [Nutritional hygiene, food hygiene]; TS201 [Basic science]
Discipline classification code
100403
Abstract
Background: Outliers can influence regression model parameters and change the direction of the estimated effect, over-estimating or under-estimating the strength of the association between a response variable and an exposure of interest. Identifying visit-level outliers in longitudinal data with continuous time-dependent covariates is important when the distribution of such a variable is highly skewed.

Objectives: The primary objective was to identify potential outliers at follow-up visits using the interquartile range (IQR) statistic and to assess their influence on estimated Cox regression parameters.

Methods: The study was motivated by a large TEDDY dietary longitudinal and time-to-event dataset, with continuous time-varying vitamin B12 intake as the exposure of interest and development of islet autoimmunity (IA) as the response variable. An IQR algorithm was applied to the TEDDY dataset to detect potential outliers at each visit. To assess the impact of detected outliers, data were analyzed using the extended time-dependent Cox model with a robust sandwich estimator. Partial residual diagnostic plots were examined for highly influential outliers.

Results: Extreme vitamin B12 observations that were cases of IA had a stronger influence on the Cox regression model than non-cases. Identified outliers changed the direction of the hazard ratios, the standard errors, or the strength of association with the risk of developing IA.

Conclusion: At the exploratory data analysis stage, the IQR algorithm can be used as a data quality control tool to identify potential outliers at the visit level, which can then be investigated further.
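The visit-level IQR screening described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the fence multiplier, the quantile definition, or any transformation applied to the skewed intake values, so the conventional Tukey fences (1.5 × IQR), the function name `flag_visit_outliers`, the record layout, and the toy values below are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import quantiles


def flag_visit_outliers(records, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR], computed separately per visit.

    records: iterable of (subject_id, visit, value) tuples (hypothetical layout).
    k: fence multiplier; 1.5 is the common Tukey default, assumed here.
    """
    by_visit = defaultdict(list)
    for subject_id, visit, value in records:
        by_visit[visit].append((subject_id, value))

    flagged = []
    for visit, rows in sorted(by_visit.items()):
        values = [value for _, value in rows]
        if len(values) < 4:
            # Too few observations at this visit for meaningful quartiles.
            continue
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        iqr = q3 - q1
        lower, upper = q1 - k * iqr, q3 + k * iqr
        for subject_id, value in rows:
            if value < lower or value > upper:
                flagged.append((subject_id, visit, value))
    return flagged


# Toy data (invented for illustration): intake values at two visits.
records = [
    ("s1", 3, 2.1), ("s2", 3, 2.4), ("s3", 3, 2.2), ("s4", 3, 2.6), ("s5", 3, 19.0),
    ("s1", 6, 3.0), ("s2", 6, 3.3), ("s3", 6, 2.9), ("s4", 6, 3.1), ("s5", 6, 3.2),
]
flagged = flag_visit_outliers(records)
print(flagged)  # [('s5', 3, 19.0)]
```

Computing the fences within each visit, not across the pooled data, lets the expected range shift as intake changes with age. Flagged values are candidates for review, in line with the abstract's framing of the algorithm as a data quality control step, and are not automatically excluded.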
Pages: 344-350
Number of pages: 7