Robust probabilistic PCA with missing data and contribution analysis for outlier detection

Cited by: 95
Authors
Chen, Tao [1]
Martin, Elaine [2]
Montague, Gary [2]
Affiliations
[1] Nanyang Technol Univ, Sch Chem & Biomed Engn, Singapore 637459, Singapore
[2] Univ Newcastle, Sch Chem Engn & Adv Mat, Newcastle Upon Tyne NE1 7RU, Tyne & Wear, England
Keywords
PRINCIPAL COMPONENTS; COVARIANCE; IDENTIFICATION; MATRIX;
DOI
10.1016/j.csda.2009.03.014
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Principal component analysis (PCA) is a widely adopted multivariate data analysis technique, with interpretation being established on the basis of both classical linear projection and a probability model (i.e. probabilistic PCA (PPCA)). Recently robust PPCA models, by using the multivariate t-distribution, have been proposed to consider the situation where there may be outliers within the data set. This paper presents an overview of the robust PPCA technique, and further discusses the issue of missing data. An expectation-maximization (EM) algorithm is presented for the maximum likelihood estimation of the model parameters in the presence of missing data. When applying robust PPCA for outlier detection, a contribution analysis method is proposed to identify which variables contribute the most to the occurrence of outliers, providing valuable information regarding the source of outlying data. The proposed technique is demonstrated on numerical examples, and the application to outlier detection and diagnosis in an industrial fermentation process. (C) 2009 Elsevier B.V. All rights reserved.
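As a rough illustration of how a t-distributed PPCA model can flag outliers when some entries are missing, the sketch below computes the expected latent scale weight for each sample from its observed variables only. It relies on the standard scale-mixture view of the multivariate t-distribution, in which the marginal over any subset of variables is again a t-distribution with the same degrees of freedom. The function name `expected_weights` and the parameters `mu`, `W`, `sigma2`, and `nu` are assumptions introduced for this example; this is not the authors' implementation, and it does not reproduce their EM updates or their contribution analysis.

```python
import numpy as np

def expected_weights(X, mu, W, sigma2, nu):
    """Expected scale weight E[u | observed part of x] for each row of X.

    Assumes x ~ t_nu(mu, W W^T + sigma2 * I). Missing entries in X are
    marked with np.nan. Small weights indicate likely outliers.
    """
    n, d = X.shape
    C = W @ W.T + sigma2 * np.eye(d)   # model covariance (scale matrix)
    weights = np.empty(n)
    for i in range(n):
        obs = ~np.isnan(X[i])          # mask of observed variables
        d_obs = int(obs.sum())
        if d_obs == 0:
            weights[i] = 1.0           # nothing observed: prior mean of u
            continue
        r = X[i, obs] - mu[obs]
        C_oo = C[np.ix_(obs, obs)]     # observed block of the covariance
        delta2 = r @ np.linalg.solve(C_oo, r)  # squared Mahalanobis distance
        weights[i] = (nu + d_obs) / (nu + delta2)
    return weights

# Hypothetical parameters and data, chosen only to exercise the function.
mu = np.zeros(3)
W = np.array([[1.0], [0.0], [0.0]])
X = np.array([
    [0.0, 0.0, 0.0],        # sits at the mean
    [10.0, 10.0, np.nan],   # far from the mean, one entry missing
    [np.nan, np.nan, np.nan],
])
w = expected_weights(X, mu, W, sigma2=1.0, nu=4.0)
```

In this sketch a sample at the mean receives a weight above 1, while the distant sample receives a weight close to 0, which is what would mark it as a candidate outlier. Any threshold on these weights would be a separate design choice that the abstract does not specify.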
Pages: 3706-3716
Page count: 11
Related papers
50 in total
  • [1] Islanding Detection Based on Probabilistic PCA with Missing Values in PMU Data
    Liu, Xueqin
    Laverty, David
    Best, Robert
    2014 IEEE PES GENERAL MEETING - CONFERENCE & EXPOSITION, 2014,
  • [2] Robust PCA for skewed data and its outlier map
    Hubert, Mia
    Rousseeuw, Peter
    Verdonck, Tim
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2009, 53 (06) : 2264 - 2274
  • [3] ROBUST PCA METHODS FOR COMPLETE AND MISSING DATA
    Karhunen, Juha
    NEURAL NETWORK WORLD, 2011, 21 (05) : 357 - 392
  • [4] Outlier Detection and Robust PCA Using a Convex Measure of Innovation
    Rahmani, Mostafa
    Li, Ping
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [5] Automated outlier detection and estimation of missing data
    Rhyu, Jinwook
    Bozinovski, Dragana
    Dubs, Alexis B.
    Mohan, Naresh
    Bende, Elizabeth M. Cummings
    Maloney, Andrew J.
    Nieves, Miriam
    Sangerman, Jose
    Lu, Amos E.
    Hong, Moo Sun
    Artamonova, Anastasia
    Ou, Rui Wen
    Barone, Paul W.
    Leung, James C.
    Wolfrum, Jacqueline M.
    Sinskey, Anthony J.
    Springs, Stacy L.
    Braatz, Richard D.
    COMPUTERS & CHEMICAL ENGINEERING, 2024, 180
  • [6] Robust principal component analysis and outlier detection with ecological data
    Jackson, DA
    Chen, Y
    ENVIRONMETRICS, 2004, 15 (02) : 129 - 139
  • [7] A New Approach of Outlier-robust Missing Value Imputation for Metabolomics Data Analysis
    Kumar, Nishith
    Hoque, Md Aminul
    Shahjaman, Md
    Islam, S. M. Shahinul
    Mollah, Md Nurul Haque
    CURRENT BIOINFORMATICS, 2019, 14 (01) : 43 - 52
  • [8] OUTLIER ROBUST POSTERIOR PREDICTIVE CHECKS FOR MISSING DATA MODELS
    Nicorici, Galina
    ECONOMIC COMPUTATION AND ECONOMIC CYBERNETICS STUDIES AND RESEARCH, 2014, 48 (01) : 233 - 246
  • [9] Outlier-Robust Tensor PCA
    Zhou, Pan
    Feng, Jiashi
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 3938 - 3946
  • [10] Robust PCA via Outlier Pursuit
    Xu, Huan
    Caramanis, Constantine
    Sanghavi, Sujay
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2012, 58 (05) : 3047 - 3064