Modeling naive bayes imputation classification for missing data

Cited by: 1
Authors
Khotimah, B. K. [1, 3]
Miswanto [1, 2]
Suprajitno, H. [1, 2]
Affiliations
[1] Univ Airlangga, Fac Sci & Technol, Surabaya, Indonesia
[2] Univ Airlangga, Dept Math, Surabaya, Indonesia
[3] Univ Trunojoyo Madura, Dept Informat Engn, Bangkalan, Indonesia
Keywords
VALUES;
DOI
10.1088/1755-1315/243/1/012111
Chinese Library Classification (CLC) number
G40 [Education];
Discipline classification code
040101 ; 120403 ;
Abstract
Naive Bayes Imputation (NBI) fills in missing values by replacing the missing attribute information according to a probability estimate. The NBI process divides the whole data set into two subsets: the complete data and the data containing missing values. The complete data is used to impute the missing values. The process is repeated for each attribute with missing values to generate complete data for classification. This research applies NBI for imputation and preprocessing in preparation for the classification process. The experiments compare imputation with NBI against imputation with the mean and the mode. The data used for imputation is the full training set of complete data, so that the predicted missing values represent the entire data. The results show that NBI produces imputations with higher accuracy than the other imputation methods. NBI with single imputation and with multiple imputation results in better performance because of the right features. This study aims to measure the effect of missing values on the Naive Bayes Imputation algorithm, which is based on a probabilistic model using mixed data. Empirically, the interaction between imputation methods and supervised classifiers results in differences in classification performance for the same imputation method.
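The procedure described in the abstract (split the data into complete and incomplete rows, fit a Naive Bayes model on the complete rows, and predict each missing attribute in turn) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes purely categorical attributes (the paper works with mixed data), marks missing values with `None`, uses Laplace smoothing, and the names `nb_impute`, `mode_impute`, and the toy `data` table are hypothetical. A mode-imputation baseline is included for comparison.

```python
from collections import Counter, defaultdict
import math

MISSING = None  # assumed marker for a missing value


def _train(complete, target):
    """Count class priors and per-attribute value frequencies on complete rows."""
    predictors = [j for j in range(len(complete[0])) if j != target]
    priors = Counter(r[target] for r in complete)
    # counts[j][c][v]: complete rows with class c whose column j equals v
    counts = {j: defaultdict(Counter) for j in predictors}
    levels = {j: set() for j in predictors}
    for r in complete:
        for j in predictors:
            counts[j][r[target]][r[j]] += 1
            levels[j].add(r[j])
    return predictors, priors, counts, levels, len(complete)


def _predict(model, row):
    """Return the most probable value of the target column for one row."""
    predictors, priors, counts, levels, n = model
    best, best_score = None, -math.inf
    for c, n_c in priors.items():
        score = math.log(n_c / n)
        for j in predictors:
            if row[j] is MISSING:
                continue  # ignore predictors that are themselves missing
            # Laplace-smoothed conditional probability P(row[j] | c)
            score += math.log((counts[j][c][row[j]] + 1) / (n_c + len(levels[j])))
        if score > best_score:
            best, best_score = c, score
    return best


def nb_impute(rows):
    """Impute every missing cell using models fit on the complete rows only."""
    complete = [r for r in rows if MISSING not in r]
    if not complete:
        raise ValueError("need at least one complete row to train on")
    filled = [list(r) for r in rows]
    for target in range(len(rows[0])):
        model = _train(complete, target)
        for i, r in enumerate(rows):
            if r[target] is MISSING:
                filled[i][target] = _predict(model, r)
    return filled


def mode_impute(rows):
    """Baseline: replace each missing cell with its column's most common value."""
    filled = [list(r) for r in rows]
    for j in range(len(rows[0])):
        mode = Counter(r[j] for r in rows if r[j] is not MISSING).most_common(1)[0][0]
        for row in filled:
            if row[j] is MISSING:
                row[j] = mode
    return filled


# Hypothetical toy table: (outlook, temperature, play)
data = [
    ("sunny", "hot", "no"),
    ("sunny", "hot", "no"),
    ("sunny", "mild", "no"),
    ("rainy", "cool", "yes"),
    ("rainy", "mild", "yes"),
    ("rainy", "cool", None),
    ("sunny", "hot", None),
]

if __name__ == "__main__":
    print(nb_impute(data))
    print(mode_impute(data))
```

In this sketch every model is trained on the originally complete rows only, which mirrors the abstract's statement that the complete data is used as the training set; imputed values are not fed back into later models. Numeric attributes would need a separate likelihood (for example a Gaussian), which is left out here.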
Pages: 10
Related papers
50 in total
  • [11] A new analytical framework for missing data imputation and classification with uncertainty: Missing data imputation and heart failure readmission prediction
    Hu, Zhiyong
    Du, Dongping
    PLOS ONE, 2020, 15 (09):
  • [12] Study on missing data imputation and modeling for the leaching process
    He, Dakuo
    Wang, Zhengsong
    Yang, Le
    Dai, Wanwan
    CHEMICAL ENGINEERING RESEARCH & DESIGN, 2017, 124 : 1 - 19
  • [13] FCMPSO: An Imputation for Missing Data Features in Heart Disease Classification
    Salleh, Mohd Najib Mohd
    Samat, Nurul Ashikin
    INTERNATIONAL RESEARCH AND INNOVATION SUMMIT (IRIS2017), 2017, 226
  • [14] IMPUTATION OF MISSING DATA
    Lunt, M.
    ANNALS OF THE RHEUMATIC DISEASES, 2014, 73 : 49 - 49
  • [15] Simple data imputation for missing feature values in binary classification
    Chatterjee, Avishek
    Woodruff, Henry
    Vallieres, Martin
    Seuntjens, Jan
    MEDICAL PHYSICS, 2019, 46 (11) : 5378 - 5378
  • [16] Impact of imputation of missing values on classification error for discrete data
    Farhangfar, Alireza
    Kurgan, Lukasz
    Dy, Jennifer
    PATTERN RECOGNITION, 2008, 41 (12) : 3692 - 3705
  • [17] Naive Bayes Classification Algorithm Based on Optimized Training Data
    Zhu, Xiaodan
    Su, Jinsong
    Wu, Qingfeng
    Dong, Huailin
    MECHATRONICS AND INTELLIGENT MATERIALS II, PTS 1-6, 2012, 490-495 : 460 - 464
  • [18] Cost-sensitive Naive Bayes Classification of Uncertain Data
    Zhang, Xing
    Li, Mei
    Zhang, Yang
    Ning, Jifeng
    JOURNAL OF COMPUTERS, 2014, 9 (08) : 1897 - 1903
  • [19] Naive Bayes classification in R
    Zhang, Zhongheng
    ANNALS OF TRANSLATIONAL MEDICINE, 2016, 4 (12) : 1 - 5
  • [20] Improving naive bayes for classification
    Jiang, L.
    Cai, Z.
    Wang, D.
    International Journal of Computers and Applications, 2010, 32 (03) : 328 - 332