Exploring process data with the use of robust outlier detection algorithms

Cited by: 140
Authors
Chiang, LH [1]
Pell, RJ [1]
Seasholtz, MB [1]
Affiliations
[1] Dow Chem Co USA, Analyt Sci Lab, Midland, MI 48667 USA
Keywords
outliers; robust statistics; process data; data preprocessing; scaling methods; TENNESSEE EASTMAN PROBLEM; PRINCIPAL COMPONENTS;
DOI
10.1016/S0959-1524(02)00068-9
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
To implement on-line process monitoring techniques such as principal component analysis (PCA) or partial least squares (PLS), it is necessary to extract data associated with normal operating conditions from the plant historical database for calibrating the models. One way to do this is to use robust outlier detection algorithms such as resampling by half-means (RHM), smallest half volume (SHV), or ellipsoidal multivariate trimming (MVT) in the off-line model building phase. While RHM and SHV are conceptually clear and statistically sound, their computational requirements are heavy. Closest distance to center (CDC) is proposed in this paper as an alternative for outlier detection. The use of Mahalanobis distance in the initial step of MVT for detecting outliers is known to be ineffective. To improve MVT, CDC is incorporated into it. Performance was evaluated relative to the goal of finding the best half of a data set. Data sets were derived from the Tennessee Eastman process (TEP) simulator. Comparable results were obtained for RHM, SHV, and CDC. Better performance was obtained when CDC was incorporated with MVT than when either CDC or MVT was used alone. All robust outlier detection algorithms outperformed the standard PCA algorithm. The effects of auto scaling, robust scaling, and a new scaling approach called modified scaling were investigated. In the presence of multiple outliers, auto scaling was found to degrade the performance of all the robust techniques. Reasonable results were obtained with robust scaling and modified scaling. (C) 2003 Elsevier Science Ltd. All rights reserved.
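The abstract names the techniques but does not define them, so the following is only a rough sketch of the general idea, not the authors' algorithm. It assumes that robust scaling means centering by the column median and dividing by the median absolute deviation, that CDC keeps the half of the observations with the smallest Euclidean distance to that center, and that MVT iteratively re-selects the half with the smallest Mahalanobis distance. The function names `robust_scale`, `cdc_best_half`, and `mvt_refine` are hypothetical.

```python
import numpy as np


def robust_scale(X):
    """Center each column by its median and scale by its MAD.

    Assumption: this is one common form of 'robust scaling'; the
    record does not give the paper's exact definition.
    """
    med = np.median(X, axis=0)
    mad = np.median(np.abs(X - med), axis=0)
    mad = np.where(mad == 0, 1.0, mad)  # avoid division by zero
    return (X - med) / mad


def cdc_best_half(X):
    """Sketch of a closest-distance-to-center selection.

    Returns the sorted row indices of the ceil(n/2) observations
    closest (Euclidean) to the center of the robustly scaled data.
    """
    Z = robust_scale(X)
    dist = np.linalg.norm(Z, axis=1)
    n_keep = (X.shape[0] + 1) // 2
    return np.sort(np.argsort(dist)[:n_keep])


def mvt_refine(X, start_idx, n_iter=20):
    """Sketch of multivariate trimming started from a given subset.

    Each pass estimates the mean and covariance from the retained
    rows, then keeps the rows with the smallest Mahalanobis distance.
    """
    idx = np.sort(np.asarray(start_idx))
    n_keep = len(idx)
    for _ in range(n_iter):
        mu = X[idx].mean(axis=0)
        cov_inv = np.linalg.pinv(np.cov(X[idx], rowvar=False))
        diff = X - mu
        d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
        new_idx = np.sort(np.argsort(d2)[:n_keep])
        if np.array_equal(new_idx, idx):
            break  # the retained subset has stabilized
        idx = new_idx
    return idx


if __name__ == "__main__":
    # Synthetic data (not from the paper): 90 clean rows, 10 shifted rows.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(size=(90, 3)),
                   rng.normal(loc=10.0, size=(10, 3))])
    half = mvt_refine(X, cdc_best_half(X))
    print("retained rows:", len(half))
```

Starting the trimming from a distance-based subset rather than from the full-sample mean and covariance reflects the abstract's remark that Mahalanobis distance is ineffective as the initial step, since outliers inflate the initial covariance estimate.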
Pages: 437-449
Number of pages: 13
Related papers
50 in total
  • [21] Outlier Detection for Compositional Data Using Robust Methods
    Peter Filzmoser
    Karel Hron
    Mathematical Geosciences, 2008, 40 : 233 - 248
  • [22] ROBUST REGRESSION AND OUTLIER DETECTION FOR NONLINEAR MODELS USING GENETIC ALGORITHMS
    VANKEERBERGHEN, P
    SMEYERSVERBEKE, J
    LEARDI, R
    KARR, CL
    MASSART, DL
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 1995, 28 (01) : 73 - 87
  • [23] Use of statistical outlier detection method in adaptive evolutionary algorithms
    Whitacre, James M.
    Pham, Tuan Q.
    Sarker, Ruhul A.
    GECCO 2006: GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, VOL 1 AND 2, 2006, : 1345 - +
  • [24] The Outlier Interval Detection Algorithms on Astronautical Time Series Data
    Hu, Wei
    Bao, Junpeng
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2013, 2013
  • [25] Robust detrending, rereferencing, outlier detection, and inpainting for multichannel data
    de Cheveigne, Alain
    Arzounian, Dorothee
    NEUROIMAGE, 2018, 172 : 903 - 912
  • [26] Robust local outlier detection with statistical parameter for big data
    Lei, Jingsheng
    Jiang, Teng
    Wu, Kui
    Du, Haizhou
    Zhu, Lin
    COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2015, 30 (05) : 411 - 419
  • [27] Robust principal component analysis and outlier detection with ecological data
    Jackson, DA
    Chen, Y
    ENVIRONMETRICS, 2004, 15 (02) : 129 - 139
  • [28] A pruned support vector data description-based outlier detection method: Applied to robust process monitoring
    Yuan, Ping
    Mao, Zhizhong
    Wang, Biao
    TRANSACTIONS OF THE INSTITUTE OF MEASUREMENT AND CONTROL, 2020, 42 (11) : 2113 - 2126
  • [29] Fuzzy Treatment Method for Outlier Detection in Process Data
    Tanatavikorn, Harakhun
    Yamashita, Yoshiyuki
    JOURNAL OF CHEMICAL ENGINEERING OF JAPAN, 2016, 49 (09) : 864 - 873
  • [30] Algorithms for spatial Outlier detection
    Lu, CT
    Chen, DC
    Kou, WF
    THIRD IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2003, : 597 - 600