Exploring process data with the use of robust outlier detection algorithms

Cited by: 140
Authors
Chiang, LH [1]
Pell, RJ [1]
Seasholtz, MB [1]
Affiliations
[1] Dow Chem Co USA, Analyt Sci Lab, Midland, MI 48667 USA
Keywords
outliers; robust statistics; process data; data preprocessing; scaling methods; Tennessee Eastman problem; principal components
DOI
10.1016/S0959-1524(02)00068-9
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
To implement on-line process monitoring techniques such as principal component analysis (PCA) or partial least squares (PLS), it is necessary to extract data associated with the normal operating conditions from the plant historical database for calibrating the models. One way to do this is to use robust outlier detection algorithms such as resampling by half-means (RHM), smallest half volume (SHV), or ellipsoidal multivariate trimming (MVT) in the off-line model building phase. While RHM and SHV are conceptually clear and statistically sound, their computational requirements are heavy. Closest distance to center (CDC) is proposed in this paper as an alternative for outlier detection. The use of Mahalanobis distance in the initial step of MVT for detecting outliers is known to be ineffective. To improve MVT, CDC is incorporated with MVT. The performance was evaluated relative to the goal of finding the best half of a data set. Data sets were derived from the Tennessee Eastman process (TEP) simulator. Comparable results were obtained for RHM, SHV, and CDC. Better performance was obtained when CDC was incorporated with MVT than when CDC or MVT was used alone. All robust outlier detection algorithms outperformed the standard PCA algorithm. The effects of auto scaling, robust scaling, and a new scaling approach called modified scaling were investigated. In the presence of multiple outliers, auto scaling was found to degrade the performance of all the robust techniques. Reasonable results were obtained with the use of robust scaling and modified scaling. © 2003 Elsevier Science Ltd. All rights reserved.
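The abstract names the ideas behind CDC and its combination with MVT but gives no algorithmic detail, so the following is only a rough sketch under stated assumptions, not the authors' implementation. It assumes that robust scaling means centering each variable by its median and dividing by its median absolute deviation (MAD), that CDC keeps the half of the observations with the smallest Euclidean distance to the center of the scaled data, and that MVT repeatedly re-estimates the mean and covariance from the retained observations and keeps the same number of observations with the smallest Mahalanobis distances. The function names `robust_scale`, `cdc_best_half`, and `mvt_from_subset` are hypothetical.

```python
import numpy as np


def robust_scale(X):
    """Center columns by the median and scale by the MAD (assumed form of robust scaling)."""
    med = np.median(X, axis=0)
    mad = np.median(np.abs(X - med), axis=0)
    mad = np.where(mad == 0, 1.0, mad)  # avoid division by zero for constant columns
    return (X - med) / mad


def cdc_best_half(X):
    """Assumed CDC step: indices of the n // 2 rows closest to the scaled center."""
    Z = robust_scale(X)
    dist = np.linalg.norm(Z, axis=1)  # Euclidean distance to the origin of scaled data
    h = X.shape[0] // 2
    return np.sort(np.argsort(dist)[:h])


def mvt_from_subset(X, start_idx, n_iter=20):
    """Assumed MVT step: iterative Mahalanobis trimming seeded with a clean subset."""
    Z = robust_scale(X)
    idx = np.sort(np.asarray(start_idx))
    h = len(idx)
    for _ in range(n_iter):
        mu = Z[idx].mean(axis=0)
        cov = np.atleast_2d(np.cov(Z[idx], rowvar=False))
        diff = Z - mu
        # Squared Mahalanobis distance of every row to the current estimate;
        # the pseudo-inverse keeps this defined if the covariance is singular.
        d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.pinv(cov), diff)
        new_idx = np.sort(np.argsort(d2)[:h])
        if np.array_equal(new_idx, idx):
            break  # the retained subset no longer changes
        idx = new_idx
    return idx
```

Calling `mvt_from_subset(X, cdc_best_half(X))` on an observations-by-variables array `X` mirrors the combination described above: the distance-based half seeds the trimming, so the first Mahalanobis estimate is not computed from the full, possibly contaminated data set.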
Pages: 437-449
Page count: 13