A Study on Missing Values Imputation using K-Harmonic Means Algorithm: Mixed Datasets

Cited by: 2
Authors
Anwar, Taufik [1]
Siswantining, Titin [1]
Sarwinda, Devvi [1]
Soemartojo, Saskya Mary [1]
Bustamam, Alhadi [1]
Affiliations
[1] Univ Indonesia, Fac Math & Nat Sci FMIPA, Dept Math, Depok 16424, Indonesia
Keywords
CLUSTERING-ALGORITHM;
DOI
10.1063/1.5141651
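Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Subject classification codes
07; 0710; 09
Abstract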
Data cleaning is a preprocessing step during which missing values are often found in the dataset. A missing value is the absence of a data item for a subject. A quick way to handle missing values is to remove the records that contain them, but this can reduce the information in the data. Another way is imputation, using the mean, median, or mode, or methods such as regression, likelihood, and clustering approaches. Imputation with the clustering approach is the focus of this study, in which we used K-Harmonic Means adjusted to handle mixed data. K-Harmonic Means is an extension of K-Means that reduces sensitivity to random centroid initialization. Missing values are imputed by assigning each observation with missing values to a cluster and replacing the missing values with information from that cluster's centroid. The simulation results were evaluated using the root mean square error for numerical data and the accuracy of the imputed values for categorical data.
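The imputation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers numerical features only (the mixed-data adjustment used in the study is not reproduced), it uses the commonly cited K-Harmonic Means membership and weight updates, and the function names `khm_fit`, `impute_with_centroids`, and `rmse`, the exponent `p=3.5`, and the synthetic two-cluster data are all assumptions made for illustration.

```python
import numpy as np


def khm_fit(X, k, p=3.5, n_iter=100, eps=1e-8, seed=0):
    """Numeric-only K-Harmonic Means sketch; returns a (k, d) centroid array."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Distance from every point to every centroid, floored to avoid 1/0.
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        d = np.maximum(d, eps)
        inv_p = d ** (-p)
        inv_p2 = d ** (-p - 2)
        # Soft membership of each point in each cluster.
        membership = inv_p2 / inv_p2.sum(axis=1, keepdims=True)
        # Per-point weight: points far from every centroid count more.
        weight = inv_p2.sum(axis=1) / inv_p.sum(axis=1) ** 2
        mw = membership * weight[:, None]
        centroids = (mw.T @ X) / mw.sum(axis=0)[:, None]
    return centroids


def impute_with_centroids(X_missing, centroids):
    """Assign each incomplete row to the nearest centroid on its observed
    features, then copy that centroid's values into the missing positions."""
    X_filled = X_missing.copy()
    for i, row in enumerate(X_missing):
        miss = np.isnan(row)
        if not miss.any():
            continue
        d = np.linalg.norm(centroids[:, ~miss] - row[~miss], axis=1)
        X_filled[i, miss] = centroids[np.argmin(d), miss]
    return X_filled


def rmse(a, b):
    """Root mean square error between two equally shaped arrays."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


# Synthetic data (assumption): two well-separated Gaussian clusters.
rng = np.random.default_rng(42)
X_true = np.vstack([rng.normal(0.0, 0.5, (100, 3)),
                    rng.normal(10.0, 0.5, (100, 3))])

# Hide one value in each of 40 randomly chosen rows.
X_missing = X_true.copy()
rows = rng.choice(len(X_true), size=40, replace=False)
cols = rng.integers(0, X_true.shape[1], size=40)
X_missing[rows, cols] = np.nan
mask = np.isnan(X_missing)

# Fit centroids on the complete rows only, then impute the incomplete rows.
complete = X_missing[~mask.any(axis=1)]
centroids = khm_fit(complete, k=2)
filled = impute_with_centroids(X_missing, centroids)

error = rmse(filled[mask], X_true[mask])
print(f"RMSE on imputed numerical entries: {error:.3f}")
```

For categorical features, a comparable sketch would fill each missing entry with a cluster-level category and score it by accuracy (the fraction of imputed entries that match the true category), in line with the evaluation described in the abstract.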
Pages: 10
Related papers
50 in total
  • [1] K-Harmonic means type clustering algorithm for mixed datasets
    Ahmad, Amir
    Hashmi, Sarosh
    [J]. APPLIED SOFT COMPUTING, 2016, 48 : 39 - 49
  • [2] Ant clustering algorithm with K-harmonic means clustering
    Jiang, Hua
    Yi, Shenghe
    Li, Jing
    Yang, Fengqin
    Hu, Xin
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2010, 37 (12) : 8679 - 8684
  • [3] A hybrid fuzzy K-harmonic means clustering algorithm
    Wu, Xiaohong
    Wu, Bin
    Sun, Jun
    Qiu, Shengwei
    Li, Xiang
    [J]. APPLIED MATHEMATICAL MODELLING, 2015, 39 (12) : 3398 - 3409
  • [4] Adaptive K-Harmonic Means Clustering Algorithm for VANETs
    Chai, Rong
    Ge, Xianlei
    Chen, Qianbin
    [J]. 2014 14TH INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS AND INFORMATION TECHNOLOGIES (ISCIT), 2014, : 233 - 237
  • [5] Some Notes on K-Harmonic Means Clustering Algorithm
    Zhi, Xiao-bin
    Fan, Jiu-lun
    [J]. QUANTITATIVE LOGIC AND SOFT COMPUTING 2010, VOL 2, 2010, 82 : 375 - +
  • [6] K-Harmonic Means Data Clustering with PSO Algorithm
    Nie, Fangyan
    Tu, Tianyi
    Pan, Meisen
    Rong, Qiusheng
    Zhou, Huican
    [J]. ADVANCES IN ELECTRICAL ENGINEERING AND AUTOMATION, 2012, 139 : 67 - 73
  • [7] Imputation of missing values in lipidomic datasets
    Froelich, Nicolas
    Klose, Christian
    Widen, Elisabeth
    Ripatti, Samuli
    Gerl, Mathias J.
    [J]. PROTEOMICS, 2024, 24 (15)
  • [8] K-HARMONIC MEANS DATA CLUSTERING WITH IMPERIALIST COMPETITIVE ALGORITHM
    Emami, Hojjat
    Dami, Sina
    Shirazi, Hossein
    [J]. UNIVERSITY POLITEHNICA OF BUCHAREST SCIENTIFIC BULLETIN SERIES C-ELECTRICAL ENGINEERING AND COMPUTER SCIENCE, 2015, 77 (01): : 91 - 104
  • [9] K-harmonic means clustering algorithm using feature weighting for color image segmentation
    Zhou, Zhiping
    Zhao, Xiaoxiao
    Zhu, Shuwei
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (12) : 15139 - 15160