Practical approaches to principal component analysis for simultaneously dealing with missing and censored elements in chemical data

Cited by: 7
Authors
Stanimirova, I. [1]
Affiliations
[1] Silesian Univ, Inst Chem, Dept Theoret Chem, PL-40006 Katowice, Poland
Keywords
Left-censored data; Generalized nonlinear iterative partial least squares algorithm; Maximum likelihood principal component analysis; Expectation-maximization algorithm; Positive matrix factorization; Multivariate curve resolution; Maximum likelihood; Data sets; Detection limit; Incomplete data; Outliers; Values; Nondetects; Regression
DOI
10.1016/j.aca.2013.08.026
Chinese Library Classification
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
Multivariate chemical data often contain elements that are missing completely at random and the so-called left-censored elements whose values are only known to be below a definite threshold value (reporting limit). In the last several years, attention has been paid to developing methods for dealing with data containing missing elements and those that can handle data with missing elements and outliers. However, processing data with both missing and left-censored elements is still an ongoing problem. The aim of this work was to investigate which method is most suitable for handling left-censored and missing completely at random elements that are present simultaneously in chemical data by comparing the recently proposed generalized nonlinear iterative partial least squares (NIPALS) algorithm, methods that include uncertainty information, such as maximum likelihood principal component analysis (MLPCA), and replacement methods. The results of the Monte Carlo simulation study for artificial and real data sets showed that substitution with half of the reporting limit can be used when the percentage of left-censored elements per variable is up to 30-40%. The generalized NIPALS algorithm is generally recommended for a large percentage of left-censored elements per variable and particularly when a large number of variables are censored. The expectation-maximization approach applied to data with censored elements substituted with half of the reporting limits can be a strategy for dealing with missing and left-censored elements in data, but if the convergence criterion is not fulfilled, then the generalized NIPALS algorithm can be applied. © 2013 Elsevier B.V. All rights reserved.
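The replacement strategy named in the abstract (half of the reporting limit for left-censored elements, followed by an expectation-maximization step for the missing ones) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `half_limit_substitution` and `em_pca_impute`, the boolean mask `censored`, the use of NaN to mark missing elements, and the stopping rule with `tol` and `max_iter` are all assumptions made for illustration, and the EM step shown is the common iterative SVD-based PCA imputation.

```python
import numpy as np


def half_limit_substitution(X, censored, limits):
    """Replace left-censored entries with half of the reporting limit.

    X        : (n_samples, n_variables) array; NaN marks missing entries
    censored : boolean mask, True where a value is below the reporting limit
    limits   : per-variable reporting limits, shape (n_variables,)
    """
    X = np.array(X, dtype=float)  # work on a copy
    half = np.broadcast_to(0.5 * np.asarray(limits, dtype=float), X.shape)
    X[censored] = half[censored]
    return X


def em_pca_impute(X, n_components=2, max_iter=500, tol=1e-8):
    """Fill NaN entries by iterating between a PCA fit and re-imputation.

    Returns the completed matrix and a flag telling whether the
    (assumed) convergence criterion was met within max_iter iterations.
    """
    X = np.array(X, dtype=float)  # work on a copy
    missing = np.isnan(X)
    if not missing.any():
        return X, True

    # Start from column means computed over the observed entries.
    col_means = np.nanmean(X, axis=0)
    X[missing] = np.broadcast_to(col_means, X.shape)[missing]

    for _ in range(max_iter):
        # Fit a PCA model with n_components factors to the completed data.
        mean = X.mean(axis=0)
        U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
        X_hat = (U[:, :n_components] * s[:n_components]) @ Vt[:n_components] + mean

        # Update only the missing entries with the model reconstruction.
        change = np.sum((X_hat[missing] - X[missing]) ** 2)
        scale = max(np.sum(X[missing] ** 2), 1.0)
        X[missing] = X_hat[missing]
        if change <= tol * scale:
            return X, True

    return X, False
```

If the returned flag is `False`, the abstract's recommendation is to fall back on the generalized NIPALS algorithm, which is not sketched here.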
Pages: 27-37
Page count: 11
Related papers
50 in total
  • [21] Approaches for dealing with missing data in health care studies
    Penny, Kay I.
    Atkinson, Ian
    JOURNAL OF CLINICAL NURSING, 2012, 21 (19-20) : 2722 - 2729
  • [22] Approaches to Dealing With Missing Data in Railway Asset Management
    McMahon, Paul
    Zhang, Tieling
    Dwight, Richard A.
    IEEE ACCESS, 2020, 8 : 48177 - 48194
  • [23] Chemometric exploration of sea water chemical component data sets with missing elements
    Smolinski, Adam
    Falkowska, Lucyna
    Pryputniewicz, Dorota
    OCEANOLOGICAL AND HYDROBIOLOGICAL STUDIES, 2008, 37 (03) : 49 - 62
  • [24] Handling missing values in Principal Component Analysis
    Josse, Julie
    Husson, Francois
    Pages, Jerome
    JOURNAL OF THE SFDS, 2009, 150 (02): : 28 - 51
  • [25] Dynamic principal component analysis with missing values
    Kwon, Junhyeon
    Oh, Hee-Seok
    Lim, Yaeji
    JOURNAL OF APPLIED STATISTICS, 2020, 47 (11) : 1957 - 1969
  • [26] SPARSE PRINCIPAL COMPONENT ANALYSIS WITH MISSING OBSERVATIONS
    Park, Seyoung
    Zhao, Hongyu
    ANNALS OF APPLIED STATISTICS, 2019, 13 (02): : 1016 - 1042
  • [27] Solving the Missing Data Problem in Urban Traffic Estimation with Principal Component Analysis
    Yang, Qiangrong
    Hu, Jianyao
    Peng, Qi
    BDIOT 2018: PROCEEDINGS OF THE 2018 2ND INTERNATIONAL CONFERENCE ON BIG DATA AND INTERNET OF THINGS, 2018, : 23 - 28
  • [28] PRINCIPAL COMPONENT ANALYSIS WITH MISSING DATA AND ITS APPLICATION TO POLYHEDRAL OBJECT MODELING
    SHUM, HY
    IKEUCHI, K
    REDDY, R
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1995, 17 (09) : 854 - 867
  • [29] Principal component analysis with missing data and its application to polyhedral object modeling
    Shum, Heung-Yeung, 1600, IEEE, Los Alamitos, CA, United States (17):
  • [30] A novel algorithm for complete ranking of DMUs dealing with negative data using Data Envelopment Analysis and Principal Component Analysis: Pharmaceutical companies and another practical example
    Yazdi, Hoda Dalili
    Movahedi Sobhani, Farzad
    Lotfi, Farhad Hosseinzadeh
    Kazemipoor, Hamed
    PLOS ONE, 2023, 18 (09):