Robust Estimation of Gaussian Mixture Models Using Anomaly Scores and Bayesian Information Criterion for Missing Value Imputation

Cited by: 0
Authors
Mouret, F. [1 ,2 ]
Albughdadi, M. [1 ]
Duthoit, S. [1 ]
Kouame, D. [3 ]
Tourneret, J-Y [2 ]
Affiliations
[1] TerraNIS, 12 Ave Europe, F-31520 Ramonville St Agne, France
[2] Univ Toulouse, IRIT ENSEEIHT TeSA, 2 Rue Charles Camichel, F-31000 Toulouse, France
[3] Univ Toulouse, IRIT UPS, 118 Route Narbonne, F-31062 Toulouse 9, France
Keywords
Imputation; Anomaly Detection; Gaussian Mixture Model; Robust Estimation; Isolation Forest; One-Class SVM
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The Expectation-Maximization (EM) algorithm is a very popular approach for estimating the parameters of Gaussian mixture models (GMMs). A known issue with GMM estimation is its sensitivity to outliers, which can lead to poor estimation performance depending on the dataset under consideration. A common approach to deal with this issue is robust estimation, which typically consists of reducing the influence of the outliers on the estimators by weighting the impact of the samples of the dataset considered as outliers. In an unsupervised context, it is difficult to know which samples of the dataset correspond to normal observations. To this end, we propose to include within the EM algorithm an outlier detection step that assigns an anomaly score to each sample of the dataset in an unsupervised way. A modified Bayesian Information Criterion is also introduced to efficiently select the appropriate number of outliers contained in a dataset. The proposed method is tested on a benchmark remote sensing dataset from the UCI Machine Learning Repository. The experimental results show the benefit of the proposed robustification when compared to other benchmark imputation procedures.
Pages: 827-831
Number of pages: 5
Related papers
50 in total
  • [1] Estimation of missing LSF parameters using Gaussian Mixture Models
    Martin, R
    Hoelper, C
    Wittke, I
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 729 - 732
  • [2] Semiparametric Fractional Imputation Using Gaussian Mixture Models for Handling Multivariate Missing Data
    Sang, Hejian
    Kim, Jae Kwang
    Lee, Danhyang
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2022, 117 (538) : 654 - 663
  • [3] Missing Value Imputation Based on Gaussian Mixture Model for the Internet of Things
    Yan, Xiaobo
    Xiong, Weiqing
    Hu, Liang
    Wang, Feng
    Zhao, Kuo
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2015, 2015
  • [4] IMPROVING BAYESIAN MIXTURE MODELS FOR MULTIPLE IMPUTATION OF MISSING DATA USING FOCUSED CLUSTERING
    Wei, Lan
    Reiter, Jerome P.
    REVSTAT-STATISTICAL JOURNAL, 2018, 16 (02) : 213 - 230
  • [5] Missing value imputation method based on correlation analysis and Gaussian mixture model
    Zhang, Jie
    Chang, Yuqing
    Wang, Ran
    Wang, Fuli
    TRANSACTIONS OF THE INSTITUTE OF MEASUREMENT AND CONTROL, 2024,
  • [6] Collateral missing value imputation: a new robust missing value estimation algorithm for microarray data
    Sehgal, MSB
    Gondal, I
    Dooley, LS
    BIOINFORMATICS, 2005, 21 (10) : 2417 - 2423
  • [7] Comparison of speaker segmentation methods based on the Bayesian information criterion and adapted Gaussian mixture models
    Grasic, Matej
    Kos, Marko
    Zgank, Andrej
    Kacic, Zdravko
    PROCEEDINGS OF IWSSIP 2008: 15TH INTERNATIONAL CONFERENCE ON SYSTEMS, SIGNALS AND IMAGE PROCESSING, 2008, : 161 - 164
  • [8] Multivariate data imputation using Gaussian mixture models
    Silva, Diogo S. F.
    Deutsch, Clayton, V
    SPATIAL STATISTICS, 2018, 27 : 74 - 90
  • [9] Performance of Akaike Information Criterion and Bayesian Information Criterion in Selecting Partition Models and Mixture Models
    Liu, Qin
    Charleston, Michael A.
    Richards, Shane A.
    Holland, Barbara R.
    SYSTEMATIC BIOLOGY, 2023, 72 (01) : 92 - 105
  • [10] A Review On Missing Value Estimation Using Imputation Algorithm
    Armina, Roslan
    Zain, Azlan Mohd
    Ali, Nor Azizah
    Sallehuddin, Roselina
    6TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND COMPUTATIONAL MATHEMATICS (ICCSCM 2017), 2017, 892