Missing Data Imputation using Machine Learning Algorithm for Supervised Learning

Cited by: 4
Authors
Cenitta, D. [1]
Arjunan, R. Vijaya [1]
Prema, K. V. [1]
Affiliations
[1] Manipal Inst Technol MAHE, Dept CSE, Manipal, India
Keywords
Heart Disease; Data mining; UCI; Decision Tree; Missing Data; Python
DOI
10.1109/ICCC150826.2021.9402558
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
With a mortality of over 18 million per year, heart disease (HD) has emerged as the world's most lethal disease. Data-mining-based heart disease diagnosis systems can help cardiac professionals diagnose a patient's condition in a timely manner. In this work, a Python-based data mining system that diagnoses HD using a Decision Tree has been developed. The methodology uses data from the UCI data repository with 14 attributes. The dataset contains a few missing values (yet found to be a hyperparameter), and pre-processing data with such missing values is a common yet challenging problem. Mere substitution gives biased results for HD diagnosis and degrades the learning process in machine learning. Therefore, the proposed work performs missing value imputation, which gave better accuracy and more trustworthy results.
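The abstract does not say which imputation method the authors used, so the following is only an illustrative sketch of its claim that mere substitution biases the data. It uses synthetic values rather than the UCI dataset, and every name in it (`mean_impute`, `knn_impute`, the age and cholesterol columns) is a hypothetical chosen for the example. It contrasts filling gaps with the column mean against filling them from the nearest complete rows.

```python
import statistics
import random


def mean_impute(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = statistics.fmean(observed)
    return [fill if v is None else v for v in values]


def knn_impute(rows, target_col, k=3):
    """Fill None in target_col with the mean of that column over the
    k nearest complete rows (squared Euclidean distance on other columns)."""
    complete = [r for r in rows if r[target_col] is not None]
    filled = []
    for r in rows:
        new = list(r)
        if r[target_col] is None:
            def dist(other, r=r):
                return sum((a - b) ** 2
                           for i, (a, b) in enumerate(zip(r, other))
                           if i != target_col)
            nearest = sorted(complete, key=dist)[:k]
            new[target_col] = statistics.fmean(
                [n[target_col] for n in nearest])
        filled.append(new)
    return filled


# Synthetic stand-in data (an assumption, NOT the UCI heart disease data):
# a cholesterol-like value that rises with age, plus noise.
rng = random.Random(0)
ages = [rng.uniform(30, 75) for _ in range(200)]
true_chol = [150 + 2.0 * a + rng.gauss(0, 10) for a in ages]

# Hide every fifth value to simulate missing entries.
chol_missing = [None if i % 5 == 0 else c for i, c in enumerate(true_chol)]
observed = [c for c in chol_missing if c is not None]

mean_filled = mean_impute(chol_missing)
rows = list(zip(ages, chol_missing))
knn_filled = [r[1] for r in knn_impute(rows, target_col=1, k=5)]

# Mean substitution piles all filled values at one point, shrinking the spread.
print("std of observed values:  ", statistics.pstdev(observed))
print("std after mean imputation:", statistics.pstdev(mean_filled))
print("std after kNN imputation: ", statistics.pstdev(knn_filled))
```

Mean substitution always shrinks the column's spread, because every filled value sits exactly at the mean. A neighbour-based fill draws on the other attributes instead. In practice a library implementation would normally replace these hand-written helpers.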
Pages: 5