Modeling naive bayes imputation classification for missing data

Cited by: 1
Authors
Khotimah, B. K. [1, 3]
Miswanto [1, 2]
Suprajitno, H. [1, 2]
Affiliations
[1] Univ Airlangga, Fac Sci & Technol, Surabaya, Indonesia
[2] Univ Airlangga, Dept Math, Surabaya, Indonesia
[3] Univ Trunojoyo Madura, Dept Informat Engn, Bangkalan, Indonesia
Keywords
VALUES;
DOI
10.1088/1755-1315/243/1/012111
Chinese Library Classification (CLC) number
G40 [Education];
Discipline classification code
040101 ; 120403 ;
Abstract
Naive Bayes Imputation (NBI) fills in missing values by replacing the missing attribute information according to a probability estimate. The NBI process divides the data into two subsets: the complete data and the data containing missing values. The complete data is used to impute the missing values. The process is repeated for each attribute with missing values to generate a complete data set for classification. This research applies NBI for imputation and preprocessing in preparation for the classification process. The experiments compare imputation with NBI against imputation with the mean and the mode. The full set of complete data is used for training, so that the predicted missing values represent the entire data set. The results show that NBI produces imputations with higher accuracy than the other imputation methods. NBI with single imputation and with multiple imputation yields better performance because of the right features. This study aims to measure the effect of missing values on the Naive Bayes Imputation algorithm, which is based on a probabilistic model, using mixed data. Empirically, the interaction between several imputation methods and supervised classification results in differences in classification performance for the same imputation method.
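The procedure described in the abstract (split the data into complete and incomplete rows, train on the complete rows, then predict each missing attribute in turn) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it handles categorical attributes only (the paper works with mixed data), it uses Laplace smoothing, and the function name `nb_impute`, the parameter `alpha`, and the toy rows are all hypothetical.

```python
from collections import Counter, defaultdict
import math


def nb_impute(rows, alpha=1.0):
    """Fill None entries in categorical rows with a Naive Bayes prediction.

    Assumption: the model is trained only on rows that have no missing
    values, and each attribute with gaps is treated in turn as the class.
    """
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("need at least one complete row to train on")
    n_cols = len(rows[0])
    # Distinct values per column in the complete subset (used for smoothing).
    domains = [sorted({r[j] for r in complete}) for j in range(n_cols)]
    imputed = [list(r) for r in rows]  # copy, so the input is left unchanged

    for target in range(n_cols):
        # Class counts for the attribute currently being imputed.
        prior = Counter(r[target] for r in complete)
        # cond[(j, c)][v]: rows with target value c and value v in column j.
        cond = defaultdict(Counter)
        for r in complete:
            for j in range(n_cols):
                if j != target:
                    cond[(j, r[target])][r[j]] += 1

        for i, r in enumerate(rows):
            if r[target] is not None:
                continue
            best, best_score = None, -math.inf
            for c in domains[target]:
                score = math.log(prior[c] / len(complete))
                for j in range(n_cols):
                    # Skip the target itself and other missing attributes.
                    if j == target or r[j] is None:
                        continue
                    num = cond[(j, c)][r[j]] + alpha
                    den = prior[c] + alpha * len(domains[j])
                    score += math.log(num / den)
                if score > best_score:
                    best, best_score = c, score
            imputed[i][target] = best
    return imputed


# Hypothetical toy data: None marks a missing value.
rows = [
    ["sunny", "hot", "no"],
    ["sunny", "mild", "no"],
    ["rainy", "mild", "yes"],
    ["rainy", "cool", "yes"],
    ["sunny", None, "no"],
    [None, "cool", "yes"],
]
filled = nb_impute(rows)
for row in filled:
    print(row)
```

A mode-based baseline, as compared against in the study, would instead fill every gap in a column with that column's most frequent value, ignoring the other attributes of the row.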
Pages: 10
Related papers
50 in total
  • [21] Naive Bayes vs. Support Vector Machine: Resilience to Missing Data
    Shi, Hongbo
    Liu, Yaqin
    ARTIFICIAL INTELLIGENCE AND COMPUTATIONAL INTELLIGENCE, PT II, 2011, 7003 : 680 - 687
  • [22] Fuzzy neuron modeling of incomplete data for missing value imputation
    Zhang, Zheng
    Yan, Xiaoming
    Zhang, Liyong
    Lai, Xiaochen
    Lu, Wei
    INFORMATION SCIENCES, 2024, 659
  • [23] Missing data imputation: focusing on single imputation
    Zhang, Zhongheng
    ANNALS OF TRANSLATIONAL MEDICINE, 2016, 4 (01)
  • [24] Impact of missing data imputation methods on gene expression clustering and classification
    de Souto, Marcilio C. P.
    Jaskowiak, Pablo A.
    Costa, Ivan G.
    BMC BIOINFORMATICS, 2015, 16
  • [25] The impact of heterogeneous distance functions on missing data imputation and classification performance
    Santos, Miriam Seoane
    Abreu, Pedro Henriques
    Fernandez, Alberto
    Luengo, Julian
    Santos, Joao
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2022, 111
  • [26] A Genetic Programming-Based Imputation Method for Classification with Missing Data
    Cao Truong Tran
    Zhang, Mengjie
    Andreae, Peter
    GENETIC PROGRAMMING, EUROGP 2016, 2016, 9594 : 149 - 163
  • [27] Effectiveness of Simple Data Imputation for Missing Feature Values in Binary Classification
    Chatterjee, A.
    Woodruff, H.
    Lobbes, M.
    van Wijk, Y.
    Beuque, M.
    Seuntjens, J.
    Lambin, P.
    MEDICAL PHYSICS, 2020, 47 (06) : E609 - E609
  • [28] Evaluation of Machine Learning Classification Algorithms & Missing Data Imputation Techniques
    Nwulu, Nnamdi I.
    2017 INTERNATIONAL ARTIFICIAL INTELLIGENCE AND DATA PROCESSING SYMPOSIUM (IDAP), 2017
  • [29] Automatic classification of respiratory patterns involving missing data imputation techniques
    Hernandez-Pereira, Elena M.
    Alvarez-Estevez, Diego
    Moret-Bonillo, Vicente
    BIOSYSTEMS ENGINEERING, 2015, 138 : 65 - 76
  • [30] Application of the Modified Imputation Method to Missing Data to Increase Classification Performance
    Caparino, Elenita T.
    Sison, Ariel M.
    Medina, Ruji P.
    2019 IEEE 4TH INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATION SYSTEMS (ICCCS 2019), 2019 : 134 - 139