Missing value imputation method based on correlation analysis and Gaussian mixture model

Authors
Zhang, Jie [1]
Chang, Yuqing [1]
Wang, Ran [2]
Wang, Fuli [1]
Affiliations
[1] Northeastern Univ, Sch Informat Sci & Engn, Shenyang 110819, Liaoning, Peoples R China
[2] North China Elect Power Univ, Dept Automat, Baoding, Hebei, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Missing value imputation; correlation analysis; Gaussian mixture model; normality assessment; data grouping; time series
DOI
10.1177/01423312241284660
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This article proposes a novel procedure for missing value imputation that combines correlation analysis with a Gaussian mixture model (GMM). First, the normality of the data is assessed using a normality assessment algorithm, and an appropriate method for calculating the correlation coefficient is selected according to the result. Next, the original correlation matrix is binarized using a chosen threshold, and the binarized matrix is used to group the variables into categories according to the correlations among them. Different imputation methods are applied to these categories: mean imputation for single-variable groups and GMM-based imputation for multi-variable groups. For multi-variable groups, a GMM is trained using the Figueiredo-Jain algorithm, after which missing values are imputed using the mean derived from the model. Finally, experimental results on the Tennessee Eastman process and a gold hydrometallurgy process verify the superiority of the proposed algorithm.
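The pipeline described in the abstract can be sketched roughly as follows. This is an illustrative sketch under several assumptions, not the authors' implementation. The abstract does not name the normality test, so a hand-written Jarque-Bera check stands in for it. The threshold value 0.6 is arbitrary. Grouping by connected components of the binarized matrix is one plausible reading of "group variables". The GMM parameters are assumed to be already fitted, so the Figueiredo-Jain training step is not shown, and "the mean derived from the model" is interpreted here as the conditional mean of the missing entries given the observed ones. All function names are hypothetical.

```python
import numpy as np

def looks_normal(x, crit=5.991):
    """Jarque-Bera check (assumed stand-in); crit is the 5% chi-square(2) value."""
    x = x[~np.isnan(x)]
    z = (x - x.mean()) / x.std()
    skew, kurt = np.mean(z**3), np.mean(z**4)
    jb = x.size / 6.0 * (skew**2 + (kurt - 3.0)**2 / 4.0)
    return jb < crit

def rank_columns(X):
    """Column-wise ranks (ties are not averaged; adequate for continuous data)."""
    return np.argsort(np.argsort(X, axis=0), axis=0).astype(float)

def group_variables(X, threshold=0.6):
    """Group columns via connected components of the binarized |corr| matrix."""
    complete = X[~np.isnan(X).any(axis=1)]  # use only rows without gaps
    all_normal = all(looks_normal(complete[:, j]) for j in range(X.shape[1]))
    # Pearson on raw values if normal, otherwise Spearman (Pearson on ranks).
    data = complete if all_normal else rank_columns(complete)
    binary = np.abs(np.corrcoef(data, rowvar=False)) >= threshold
    groups, seen = [], set()
    for start in range(X.shape[1]):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:  # depth-first search over correlated columns
            i = stack.pop()
            comp.append(i)
            for j in np.flatnonzero(binary[i]):
                j = int(j)
                if j not in seen:
                    seen.add(j)
                    stack.append(j)
        groups.append(sorted(comp))
    return groups

def mean_impute(col):
    """Single-variable group: replace NaNs with the column mean."""
    out = col.copy()
    out[np.isnan(out)] = np.nanmean(col)
    return out

def gmm_conditional_mean(x, weights, means, covs):
    """Multi-variable group: fill NaNs with E[x_missing | x_observed]
    under a GMM whose parameters are assumed to be already fitted."""
    m = np.isnan(x)
    o = ~m
    if not m.any():
        return x.copy()
    log_r, cond = [], []
    for w, mu, S in zip(weights, means, covs):
        diff = x[o] - mu[o]
        S_oo = S[np.ix_(o, o)]
        sol = np.linalg.solve(S_oo, diff)
        _, logdet = np.linalg.slogdet(S_oo)
        # Log responsibility, up to a constant shared by all components.
        log_r.append(np.log(w) - 0.5 * (logdet + diff @ sol))
        # Per-component conditional mean of the missing block.
        cond.append(mu[m] + S[np.ix_(m, o)] @ sol)
    log_r = np.array(log_r)
    r = np.exp(log_r - log_r.max())
    r /= r.sum()
    out = x.copy()
    out[m] = r @ np.array(cond)
    return out
```

In use, `group_variables` would be called once on the data matrix, `mean_impute` applied to the column of each single-variable group, and `gmm_conditional_mean` applied row by row to the columns of each multi-variable group, with one GMM fitted per group.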
Pages: 14