Robust probabilistic PCA with missing data and contribution analysis for outlier detection

Cited by: 95
Authors
Chen, Tao [1]
Martin, Elaine [2]
Montague, Gary [2]
Affiliations
[1] Nanyang Technol Univ, Sch Chem & Biomed Engn, Singapore 637459, Singapore
[2] Univ Newcastle, Sch Chem Engn & Adv Mat, Newcastle Upon Tyne NE1 7RU, Tyne & Wear, England
Keywords
PRINCIPAL COMPONENTS; COVARIANCE; IDENTIFICATION; MATRIX
DOI
10.1016/j.csda.2009.03.014
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Principal component analysis (PCA) is a widely adopted multivariate data analysis technique that can be interpreted either as a classical linear projection or through a probability model, i.e. probabilistic PCA (PPCA). Robust PPCA models based on the multivariate t-distribution have recently been proposed to handle data sets that may contain outliers. This paper presents an overview of the robust PPCA technique and further discusses the issue of missing data. An expectation-maximization (EM) algorithm is presented for the maximum likelihood estimation of the model parameters in the presence of missing data. When robust PPCA is applied to outlier detection, a contribution analysis method is proposed to identify which variables contribute the most to the occurrence of outliers, providing valuable information about the source of the outlying data. The proposed technique is demonstrated on numerical examples and applied to outlier detection and diagnosis in an industrial fermentation process. © 2009 Elsevier B.V. All rights reserved.
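To make the role of the multivariate t-distribution more concrete, the sketch below shows a simplified stand-in rather than the paper's algorithm: a standard EM fit of a full-covariance multivariate t with fixed degrees of freedom, on complete data, with no latent PPCA structure and no missing values. The function names `fit_multivariate_t` and `mahalanobis_contributions`, the default `nu=4.0`, and the per-variable decomposition of the Mahalanobis distance are all assumptions made for illustration; the contribution analysis proposed in the paper may be defined differently.

```python
import numpy as np


def fit_multivariate_t(X, nu=4.0, n_iter=100, tol=1e-8):
    """EM estimate of the mean and scatter of a multivariate t (fixed nu).

    Illustrative only: complete data, full scatter matrix, no PPCA factors.
    Returns (mu, sigma, u), where u holds the per-sample weights; small
    weights flag samples that the model treats as potential outliers.
    """
    n, d = X.shape
    mu = X.mean(axis=0)
    sigma = np.cov(X, rowvar=False) + 1e-9 * np.eye(d)

    def weights(mu, sigma):
        r = X - mu
        # Squared Mahalanobis distance of every sample.
        delta2 = np.einsum("ij,ij->i", r @ np.linalg.inv(sigma), r)
        # E-step: expected precision scale under the t model.
        return (nu + d) / (nu + delta2)

    for _ in range(n_iter):
        u = weights(mu, sigma)
        # M-step: weighted mean and weighted scatter.
        mu_new = (u[:, None] * X).sum(axis=0) / u.sum()
        r = X - mu_new
        sigma_new = (u[:, None] * r).T @ r / n
        converged = (np.abs(mu_new - mu).max() < tol
                     and np.abs(sigma_new - sigma).max() < tol)
        mu, sigma = mu_new, sigma_new
        if converged:
            break

    return mu, sigma, weights(mu, sigma)


def mahalanobis_contributions(x, mu, sigma):
    """Split the squared Mahalanobis distance of x into per-variable terms.

    Assumed decomposition: c_j = r_j * (sigma^{-1} r)_j, so sum(c) equals
    the squared distance. This is a generic choice, not the paper's method.
    """
    r = x - mu
    return r * np.linalg.solve(sigma, r)
```

Under these assumptions, a sample with an unusually small weight in `u` would be a candidate outlier, and the largest entry returned by `mahalanobis_contributions` for that sample would point to the variable that inflates its distance the most.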
Pages: 3706-3716
Number of pages: 11