A Novel Index Measure Imputation Algorithm for Missing Data Values: A Machine Learning Approach

Cited by: 0
Authors
Madhu, G. [1]
Rajinikanth, T. V. [2]
Affiliations
[1] VNR VJIET, Dept Informat Technol, Hyderabad 500090, Andhra Pradesh, India
[2] GRIET, Dept Informat Technol, Hyderabad 500085, Andhra Pradesh, India
Keywords
classification; decision tree; index measure; missing values
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Missing data in real-world datasets plays a significant role in real-time data mining and becomes more complex to handle in large databases. Missing values influence dataset features and class attributes, and thus affect the predictive accuracy of classifiers. Over the last decade, many researchers have proposed techniques for dealing with missing attribute values in databases with homogeneous and/or numeric attributes. In this work, we propose a new index measure for the imputation of missing attribute values, which computes the similarity between any two typical elements in the dataset. It can be applied to any dataset, whether nominal and/or real-valued. The proposed algorithm is evaluated through extensive experiments and compared with the KNNI, SVMI, WKNNI, KMI and FKMI algorithms. The results show that the proposed algorithm outperforms the existing imputation algorithms in terms of classification accuracy, and that our decision tree algorithm employs highly accurate decision rules.
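The abstract does not define the proposed index measure, so the following is only a minimal sketch of the general idea behind a nearest-neighbour baseline such as KNNI on mixed nominal/numeric data. The distance used here (overlap for nominal attributes, range-normalized difference for numeric ones) is a generic stand-in and not the authors' measure; the names `mixed_distance` and `knn_impute` and the toy data are hypothetical.

```python
from collections import Counter


def mixed_distance(a, b, ranges):
    """Average per-attribute distance over attributes observed in both rows."""
    total, used = 0.0, 0
    for x, y, r in zip(a, b, ranges):
        if x is None or y is None:
            continue  # skip attributes missing in either row
        if r is None:
            total += 0.0 if x == y else 1.0  # nominal: overlap distance
        else:
            total += abs(x - y) / r if r else 0.0  # numeric: range-normalized
        used += 1
    return total / used if used else float("inf")


def knn_impute(rows, numeric, k=3):
    """Fill None entries using the k nearest rows that observe that attribute."""
    n_attrs = len(rows[0])
    ranges = []
    for j in range(n_attrs):
        if j in numeric:
            vals = [r[j] for r in rows if r[j] is not None]
            ranges.append(max(vals) - min(vals) if vals else 0.0)
        else:
            ranges.append(None)  # marks a nominal attribute
    completed = []
    for i, row in enumerate(rows):
        new_row = list(row)
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted(
                (mixed_distance(row, other, ranges), idx)
                for idx, other in enumerate(rows)
                if idx != i and other[j] is not None
            )[:k]
            donor_vals = [rows[idx][j] for _, idx in donors]
            if not donor_vals:
                continue  # no row observes this attribute; leave it missing
            if j in numeric:
                new_row[j] = sum(donor_vals) / len(donor_vals)  # mean of neighbours
            else:
                new_row[j] = Counter(donor_vals).most_common(1)[0][0]  # majority vote
        completed.append(new_row)
    return completed


# Toy data (hypothetical): column 0 is numeric, column 1 is nominal.
rows = [
    [1.0, "a"],
    [1.2, "a"],
    [5.0, "b"],
    [5.1, "b"],
    [None, "b"],
    [1.1, None],
]
completed = knn_impute(rows, numeric={0}, k=2)
```

In an evaluation like the one the abstract describes, the completed rows would then be passed to a classifier, and imputation methods would be compared by the resulting classification accuracy.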
Pages: 81 - 87
Number of pages: 7
Related papers
50 in total
  • [41] Active learning with missing values considering imputation uncertainty
    Han, Jongmin
    Kang, Seokho
    [J]. KNOWLEDGE-BASED SYSTEMS, 2021, 224
  • [42] Novel Missing-Rate-Oriented Selective Algorithm for Handling Missing Data by Minimizing Imputation
    Li, Xing
    Li, Guolin
    Fishbune, Rick
    [J]. 2016 INTERNATIONAL CONFERENCE ON CYBER-ENABLED DISTRIBUTED COMPUTING AND KNOWLEDGE DISCOVERY PROCEEDINGS - CYBERC 2016, 2016 : 234 - 237
  • [43] Semi-supervised learning with missing values imputation
    Huang, Buliao
    Zhu, Yunhui
    Usman, Muhammad
    Chen, Huanhuan
    [J]. KNOWLEDGE-BASED SYSTEMS, 2024, 284
  • [44] A Nonparametric Multiple Imputation Approach for Data with Missing Covariate Values with Application to Colorectal Adenoma Data
    Hsu, Chiu-Hsieh
    Long, Qi
    Li, Yisheng
    Jacobs, Elizabeth
    [J]. JOURNAL OF BIOPHARMACEUTICAL STATISTICS, 2014, 24 (03) : 634 - 648
  • [45] Treatment of missing values with imputation for the analysis of otologic data
    Laurikkala, J
    Kentala, E
    Juhola, M
    Pyykkö, I
    [J]. MEDICAL INFORMATICS EUROPE '99, 1999, 68 : 428 - 431
  • [46] REGRESSION IMPUTATION OF MISSING VALUES IN LONGITUDINAL DATA SETS
    SCHNEIDERMAN, ED
    KOWALSKI, CJ
    WILLIS, SM
    [J]. INTERNATIONAL JOURNAL OF BIO-MEDICAL COMPUTING, 1993, 32 (02) : 121 - 133
  • [47] Robust imputation method for missing values in microarray data
    Yoon, Dankyu
    Lee, Eun-Kyung
    Park, Taesung
    [J]. BMC BIOINFORMATICS, 2007, 8 (Suppl 2)
  • [48] Robust imputation method for missing values in microarray data
    Dankyu Yoon
    Eun-Kyung Lee
    Taesung Park
    [J]. BMC Bioinformatics, 8
  • [49] Imputation of missing values in multi-view data
    van Loon, Wouter
    Fokkema, Marjolein
    de Vos, Frank
    Koini, Marisa
    Schmidt, Reinhold
    de Rooij, Mark
    [J]. Information Fusion, 2024, 111
  • [50] A genetic algorithm for multivariate missing data imputation
    Carlos Figueroa-Garcia, Juan
    Neruda, Roman
    Hernandez-Perez, German
    [J]. INFORMATION SCIENCES, 2023, 619 : 947 - 967