Imputation of left-censored data for cluster analysis

Cited by: 1
Authors
Liu, Yushan [1]
Brown, Steven D. [1]
Affiliations
[1] Univ Delaware, Dept Chem & Biochem, Brown Lab, Newark, DE 19716 USA
Keywords
multiple imputation; finite mixture models; left-censored data; model-based clustering; DETECTION LIMIT OBSERVATIONS; GAUSSIAN MIXTURE-MODELS; WATER-QUALITY DATA; MISSING DATA; MULTIPLE IMPUTATION; MAXIMUM-LIKELIHOOD; ENVIRONMENTAL DATA; DATA SETS; MULTIVARIATE DATA; ROUNDED ZEROS;
DOI
10.1002/cem.2586
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
A new method of imputation for left-censored datasets is reported. This method is evaluated by examining datasets in which the true values of the censored data are known so that the quality of the imputation can be assessed both visually and by means of cluster analysis. Its performance in retaining certain data structures on imputation is compared with that of three other imputation algorithms by using cluster analysis on the imputed data. It is found that the new imputation method benefits a subsequent model-based cluster analysis performed on the left-censored data. The stochastic nature of the imputations performed in the new method can provide multiple imputed sets from the same incomplete data. The analysis of these provides an estimate of the uncertainty of the cluster analysis. Results from clustering suggest that the imputation is robust, with smaller uncertainty than that obtained from other multiple imputation methods applied to the same data. In addition, the use of the new method avoids problems with ill-conditioning of group covariances during imputation as well as in the subsequent clustering based on expectation-maximization. The strong imputation performance of the proposed method on simulated datasets becomes more apparent as the groups in the mixture models are increasingly overlapped. Results from real datasets suggest that the best performance occurs when the requirement of normality of each group is fulfilled, which is the main assumption of the new method. Copyright © 2013 John Wiley & Sons, Ltd.
The strong imputation performance of the proposed method on simulated and real datasets becomes more apparent as the groups in the mixture models are closer to normally distributed, even though the groups may be overlapped.
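As a rough illustration of the general idea of stochastic multiple imputation for left-censored values, the sketch below fills each censored entry with a draw from a normal distribution truncated above at the detection limit, repeats this to obtain several imputed sets, and looks at how much a downstream statistic varies across them. This is not the authors' algorithm, which is built on finite mixture models. The data, the detection limit, the single-group normal model, the naive parameter estimates, and the function name `impute_left_censored` are all assumptions made for illustration, and the mean stands in for the cluster analysis.

```python
import random
from statistics import NormalDist, mean, stdev

def impute_left_censored(values, limit, mu, sigma, rng):
    """Fill censored entries (None) with draws from N(mu, sigma) truncated above at `limit`."""
    dist = NormalDist(mu, sigma)
    p_limit = dist.cdf(limit)  # probability mass below the detection limit
    filled = []
    for v in values:
        if v is None:
            # Inverse-CDF sampling restricted to the region below the limit.
            u = rng.uniform(0.0, p_limit)
            u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf inside (0, 1)
            filled.append(dist.inv_cdf(u))
        else:
            filled.append(v)
    return filled

# Hypothetical measurements; None marks a value below the detection limit.
data = [0.8, None, 1.4, 2.1, None, 1.1, 3.0, None, 1.7, 0.9]
limit = 0.5

# Assumption: parameters estimated from the observed values only (biased upward).
observed = [v for v in data if v is not None]
mu, sigma = mean(observed), stdev(observed)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
imputed_sets = [impute_left_censored(data, limit, mu, sigma, rng) for _ in range(5)]

# Stand-in for a downstream analysis: spread of a statistic across imputed sets.
set_means = [mean(s) for s in imputed_sets]
between_set_spread = stdev(set_means)
print(set_means, between_set_spread)
```

In a clustering setting, the same loop would run the cluster analysis on each imputed set and compare the resulting partitions instead of comparing means.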
Pages: 148-160
Page count: 13