Missing value imputation using unsupervised machine learning techniques

Cited by: 46
Authors
Raja, P. S. [1]
Thangavel, K. [1]
Affiliations
[1] Periyar Univ, Dept Comp Sci, Salem, Tamil Nadu, India
Keywords
K-means; Fuzzy C-means; Rough K-means; Machine learning; Missing values; Imputation; ALGORITHMS; SET;
DOI
10.1007/s00500-019-04199-6
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In data mining, preprocessing is an essential step that involves data normalization, noise removal, handling of missing values, etc. This paper focuses on handling missing values using unsupervised machine learning techniques. Soft computing approaches are combined with clustering techniques to form a novel method for handling missing values, which helps to overcome the problem of inconsistency. A rough K-means centroid-based imputation method is proposed and compared with the K-means centroid-based, fuzzy C-means centroid-based, K-means parameter-based, fuzzy C-means parameter-based, and rough K-means parameter-based imputation methods. The experimental analysis is carried out on four benchmark datasets, viz. Dermatology, Pima, Wisconsin, and Yeast, taken from the UCI data repository. The proposed method proves effective across the different datasets, and the results are promising.
Pages: 4361-4392
Page count: 32
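
The abstract names K-means centroid-based imputation as one of the baseline methods but does not spell out the procedure. The following is a minimal sketch of the general idea only, not the authors' algorithm: it assumes clustering is run on the complete rows, that each incomplete row is matched to the nearest centroid using its observed features, and that the missing entries are copied from that centroid. The function name `kmeans_centroid_impute` and its parameters are hypothetical.

```python
import numpy as np

def kmeans_centroid_impute(X, k=2, n_iter=20, seed=0):
    """Fill NaNs with the matching coordinate of the nearest cluster centroid.

    Illustrative assumption: centroids are learned from complete rows only.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    complete = X[~missing.any(axis=1)]           # rows with no missing values
    rng = np.random.default_rng(seed)
    centroids = complete[rng.choice(len(complete), size=k, replace=False)]

    # Plain Lloyd iterations on the complete rows.
    for _ in range(n_iter):
        dists = np.linalg.norm(complete[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = complete[labels == j]
            if len(members):                      # keep old centroid if cluster is empty
                centroids[j] = members.mean(axis=0)

    # Match each incomplete row to a centroid using its observed features only,
    # then copy the centroid's values into the missing positions.
    filled = X.copy()
    for i in np.where(missing.any(axis=1))[0]:
        obs = ~missing[i]
        d = np.linalg.norm(centroids[:, obs] - X[i, obs], axis=1)
        filled[i, missing[i]] = centroids[d.argmin(), missing[i]]
    return filled

# Toy data (made up for illustration): two well-separated groups.
X_demo = np.array([
    [0.0, 0.0], [0.2, 0.1], [0.1, 0.2],
    [10.0, 10.0], [10.2, 10.1], [10.1, 10.2],
    [0.1, np.nan], [np.nan, 10.0],
])
X_filled = kmeans_centroid_impute(X_demo, k=2)
```

The fuzzy C-means and rough K-means variants named in the abstract would replace the hard nearest-centroid assignment with membership-weighted or approximation-based assignments; those are not sketched here.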