Mixture model based multivariate statistical analysis of multiply censored environmental data

Cited by: 22
Author
He, Jianxun [1]
Affiliation
[1] Lakehead Univ, Dept Civil Engn, Thunder Bay, ON P7B 5E1, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Water quality; Gaussian mixture model; Maximum likelihood estimation; Censored data; Detection limit; WATER-QUALITY CONCENTRATIONS; MAXIMUM-LIKELIHOOD;
DOI
10.1016/j.advwatres.2013.05.001
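Chinese Library Classification number
TV21 [Water resources survey and water conservancy planning];
Subject classification code
081501;
Abstract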
Environmental data are commonly constrained by a detection limit (DL) because of the limitations of experimental apparatus. In particular, because experimental units or assay methods change, the observed data are often censored at more than one DL. Measurements below the DLs are typically replaced by an arbitrary value such as zero, half of the DL, or the DL itself for convenience of analysis. However, this substitution approach is widely considered unreliable and prone to bias. In contrast, maximum likelihood estimation (MLE) methods for censored data have been developed for better performance and statistical justification. However, existing MLE methods seldom address the multivariate context of censored environmental data, especially for water quality. This paper proposes using a mixture model to flexibly approximate the underlying distribution of the observed data, owing to its good approximation capability and generation mechanism. In particular, this study focuses on the Gaussian mixture model (GMM). To cope with censored data with multiple DLs, an expectation-maximization (EM) algorithm in a multivariate setting is developed. The proposed statistical analysis approach is verified on both simulated data and real water quality data. © 2013 Elsevier Ltd. All rights reserved.
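The contrast the abstract draws between substitution and likelihood-based estimation can be illustrated with a deliberately simplified sketch. It is not the paper's multivariate GMM algorithm: it assumes a single univariate Gaussian, left-censoring only, and two hypothetical detection limits. All function names, parameter values, and the simulated data are assumptions made for illustration.

```python
import math
import random


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def substitution_estimate(values, limits, censored, fraction=0.5):
    """Replace each censored value by fraction * DL, then take mean and SD."""
    filled = [fraction * d if c else v
              for v, d, c in zip(values, limits, censored)]
    n = len(filled)
    mu = sum(filled) / n
    var = sum((x - mu) ** 2 for x in filled) / n
    return mu, math.sqrt(var)


def censored_normal_em(values, limits, censored, n_iter=200):
    """EM for one Gaussian with left-censored observations (illustrative).

    E-step: for a value known only to lie below its DL d, use the first and
    second moments of a normal truncated above at d.
    M-step: update mu and sigma from the completed moments.
    """
    mu, sigma = substitution_estimate(values, limits, censored)  # start values
    n = len(values)
    for _ in range(n_iter):
        s1 = s2 = 0.0
        for v, d, c in zip(values, limits, censored):
            if c:
                a = (d - mu) / sigma
                lam = norm_pdf(a) / max(norm_cdf(a), 1e-300)
                m1 = mu - sigma * lam                       # E[x | x < d]
                m2 = sigma * sigma * (1.0 - a * lam - lam * lam) + m1 * m1
            else:
                m1, m2 = v, v * v
            s1 += m1
            s2 += m2
        mu = s1 / n
        sigma = math.sqrt(max(s2 / n - mu * mu, 1e-12))
    return mu, sigma


def simulate(n=4000, mu=1.0, sigma=1.0, dls=(0.5, 1.0), seed=0):
    """Draw normal data; the first half uses dls[0], the second half dls[1]."""
    rng = random.Random(seed)
    values, limits, censored = [], [], []
    for i in range(n):
        d = dls[0] if i < n // 2 else dls[1]
        x = rng.gauss(mu, sigma)
        c = x < d
        values.append(None if c else x)  # censored values are not observed
        limits.append(d)
        censored.append(c)
    return values, limits, censored
```

Calling `simulate()` and passing its three lists to `substitution_estimate` and `censored_normal_em` lets the two estimates be compared against the known simulation parameters. Extending this to several mixture components and correlated variables, as the paper does, requires moments of truncated multivariate normals and is beyond this sketch.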
Pages: 15-24
Page count: 10
Related papers
50 in total
  • [21] Analysis of two-sample censored data using a semiparametric mixture model
    Li, Gang
    Lin, Chien-tai
    [J]. ACTA MATHEMATICAE APPLICATAE SINICA-ENGLISH SERIES, 2009, 25 (03): : 389 - 398
  • [22] Statistical Analysis of Some Complex Censored Data
    Liu, Huanbin
    Rao, Congjun
    [J]. ADVANCES IN COMPUTER SCIENCE, INTELLIGENT SYSTEM AND ENVIRONMENT, VOL 1, 2011, 104 : 215 - 219
  • [23] Statistical prediction based on censored life data
    Escobar, LA
    Meeker, WQ
    [J]. TECHNOMETRICS, 1999, 41 (02) : 113 - 124
  • [24] The analysis of multivariate interval-censored survival data
    Kim, MY
    Xue, XN
    [J]. STATISTICS IN MEDICINE, 2002, 21 (23) : 3715 - 3726
  • [25] Semiparametric regression analysis of multivariate doubly censored data
    Li, Shuwei
    Hu, Tao
    Tong, Tiejun
    Sun, Jianguo
    [J]. STATISTICAL MODELLING, 2020, 20 (05) : 502 - 526
  • [26] Exploratory tobit factor analysis for multivariate censored data
    Kamakura, WA
    Wedel, M
    [J]. MULTIVARIATE BEHAVIORAL RESEARCH, 2001, 36 (01) : 53 - 82
  • [27] Mixture formulation through multivariate statistical analysis of process data in property cluster space
    Hada, Subin
    Herring, Robert H., III
    Eden, Mario R.
    [J]. COMPUTERS & CHEMICAL ENGINEERING, 2017, 107 : 26 - 36
  • [28] A linear mixed-effects model for multivariate censored data
    Pan, W
    Louis, TA
    [J]. BIOMETRICS, 2000, 56 (01) : 160 - 166
  • [29] The Application of Spark-Based Gaussian Mixture Model for Farm Environmental Data Analysis
    Pang, Honglin
    Deng, Li
    Wang, Ling
    Fei, Minrui
    [J]. THEORY, METHODOLOGY, TOOLS AND APPLICATIONS FOR MODELING AND SIMULATION OF COMPLEX SYSTEMS, PT III, 2016, 645 : 164 - 173
  • [30] Multivariate data clustering for the Gaussian mixture model
    Kavaliauskas, M
    Rudzkis, R
    [J]. INFORMATICA, 2005, 16 (01) : 61 - 74