Treatment of missing values in process data analysis

Cited by: 62
Authors
Imtiaz, S. A. [1]
Shah, S. L. [1]
Affiliation
[1] Univ Alberta, Dept Chem & Mat Engn, Edmonton, AB T6G 2G6, Canada
Source
Keywords
missing data; PCA; batch process; compression; multi-rate data;
DOI
10.1002/cjce.20099
Chinese Library Classification (CLC) number
TQ [Chemical Industry];
Subject classification code
0817;
Abstract
Process data suffer from many types of imperfections, for example bad data due to sensor problems, multi-rate sampling, outliers, and compression. Since most modelling and data analysis methods are developed for regularly sampled, well-conditioned data sets, the data need pre-treatment. Traditionally, data conditioning or pre-treatment has been done without taking into account the end use of the data; for example, univariate methods have been used to interpolate bad data even when the data are intended for multivariate analysis. In this paper we consider pre-treatment and data analysis as a collective problem and propose data conditioning methods in a multivariate framework. We first review classical process data analysis methods and acclaimed missing-data handling techniques used in statistical surveys and biostatistics. The application of these techniques is demonstrated in three instances: (i) principal components analysis (PCA) is extended in a data augmentation (DA) framework to deal with missing values, (ii) an iterative missing-data technique is used to synchronize batch process data of uneven length, and (iii) a PCA-based iterative missing-data technique is used to restore the correlation structure of compressed data.
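The idea of a PCA-based iterative missing-data technique can be illustrated with a minimal sketch. This is a generic iterative PCA imputation under stated assumptions, not the authors' data augmentation algorithm: the function name `iterative_pca_impute`, its parameters, and the synthetic rank-2 data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def iterative_pca_impute(X, n_components=2, max_iter=200, tol=1e-8):
    """Fill NaN entries by alternating a rank-k PCA fit and reconstruction.

    Illustrative sketch only; names and defaults are assumptions.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Initial guess: replace each missing entry with its column mean.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)
    for _ in range(max_iter):
        mean = filled.mean(axis=0)
        centered = filled - mean
        # Rank-k PCA model obtained from a truncated SVD.
        U, s, Vt = np.linalg.svd(centered, full_matrices=False)
        k = n_components
        recon = (U[:, :k] * s[:k]) @ Vt[:k] + mean
        # Overwrite only the missing entries; observed values stay fixed.
        new_filled = np.where(missing, recon, X)
        denom = max(np.linalg.norm(filled), 1e-12)
        change = np.linalg.norm(new_filled - filled) / denom
        filled = new_filled
        if change < tol:
            break
    return filled


# Synthetic example: six correlated variables driven by two latent factors.
rng = np.random.default_rng(0)
scores = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 6))
X_true = scores @ loadings

# Remove roughly 10% of the entries at random.
mask = rng.random(X_true.shape) < 0.1
X_obs = X_true.copy()
X_obs[mask] = np.nan

X_hat = iterative_pca_impute(X_obs, n_components=2)

# Compare against plain column-mean (univariate) imputation.
X_mean = np.where(mask, np.nanmean(X_obs, axis=0), X_obs)
rmse_pca = np.sqrt(np.mean((X_hat[mask] - X_true[mask]) ** 2))
rmse_mean = np.sqrt(np.mean((X_mean[mask] - X_true[mask]) ** 2))
print(f"RMSE, iterative PCA: {rmse_pca:.4f}")
print(f"RMSE, column mean:   {rmse_mean:.4f}")
```

The comparison with column-mean filling mirrors the abstract's point that univariate interpolation ignores the correlation between variables, whereas a multivariate model can use the observed variables in a row to estimate the missing ones.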
Pages: 838-858
Page count: 21
Related papers
50 in total
  • [1] Treatment of missing values with imputation for the analysis of otologic data
    Laurikkala, J
    Kentala, E
    Juhola, M
    Pyykkö, I
    [J]. MEDICAL INFORMATICS EUROPE '99, 1999, 68 : 428 - 431
  • [2] ANALYSIS OF DATA WITH MISSING VALUES - COMMENTARY
    LITTLE, RJA
    [J]. STATISTICS IN MEDICINE, 1988, 7 (1-2) : 347 - 355
  • [3] ANALYSIS OF DATA WITH MISSING VALUES - DISCUSSION
    HELMS, RW
    LAIRD, NM
    LEBOWITZ, MD
    MANTEL, N
    LOUIS, TA
    WU, M
    [J]. STATISTICS IN MEDICINE, 1988, 7 (1-2) : 357 - 360
  • [4] ESTIMATION OF MISSING VALUES FOR THE ANALYSIS OF INCOMPLETE DATA
    WILKINSON, GN
    [J]. BIOMETRICS, 1958, 14 (02) : 257 - 286
  • [5] A Matrix Completion Method for Imputing Missing Values of Process Data
    Zhang, Xinyu
    Sun, Xiaoyan
    Xia, Li
    Tao, Shaohui
    Xiang, Shuguang
    [J]. PROCESSES, 2024, 12 (04)
  • [6] Principal Component Analysis of Process Datasets with Missing Values
    Severson, Kristen A.
    Molaro, Mark C.
    Braatz, Richard D.
    [J]. PROCESSES, 2017, 5 (03)
  • [7] MISSING VALUES IN DATA
    RACKLEY, K
    [J]. SIAM REVIEW, 1974, 16 (01) : 136 - 136
  • [8] Treatment of missing values for multivariate statistical analysis of gel-based proteomics data
    Pedreschi, Romina
    Hertog, Maarten L. A. T. M.
    Carpentier, Sebastien C.
    Lammertyn, Jeroen
    Robben, Johan
    Noben, Jean-Paul
    Panis, Bart
    Swennen, Rony
    Nicolai, Bart M.
    [J]. PROTEOMICS, 2008, 8 (07) : 1371 - 1383
  • [9] THE ANALYSIS OF SOCIAL-SCIENCE DATA WITH MISSING VALUES
    LITTLE, RJA
    RUBIN, DB
    [J]. SOCIOLOGICAL METHODS & RESEARCH, 1989, 18 (2-3) : 292 - 326
  • [10] Robust Principal Component Analysis of Data with Missing Values
    Karkkainen, Tommi
    Saarela, Mirka
    [J]. MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION, MLDM 2015, 2015, 9166 : 140 - 154