Bio inspired Ensemble Feature Selection (BEFS) Model with Machine Learning and Data Mining Algorithms for Disease Risk Prediction

Cited by: 3
Authors
Pasha, Syed Javeed [1]
Mohamed, E. Syed [2]
Affiliations
[1] BS Abdur Rahman Crescent Inst Sci & Technol, Dept Comp Applicat, Chennai, Tamil Nadu, India
[2] BS Abdur Rahman Crescent Inst Sci & Technol, Dept Comp Sci, Chennai, Tamil Nadu, India
Keywords
Bio inspired ensemble feature selection (BEFS) model; machine learning; data mining; feature selection; health care; disease risk prediction; breast cancer risk prediction; genetic algorithm; random forest; logistic regression; BREAST-CANCER; DIAGNOSIS
DOI
10.1109/iccubea47591.2019.9129304
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Machine learning (ML) and data mining (DM) algorithms have been applied increasingly often in recent years to disease risk prediction problems in healthcare. Traditional feature selection models are commonly combined with DM and ML algorithms to improve the accuracy of disease risk prediction. This study introduces a new Bio-inspired Ensemble Feature Selection (BEFS) model that is applied together with DM and ML algorithms. In the BEFS model, the most relevant and highly contributing features for the prediction are determined with a bio-inspired algorithm, the genetic algorithm, and an ensemble algorithm, the random forest algorithm. The important features obtained from the proposed model are then grouped in various combinations and used with the DM and ML algorithms, here logistic regression (LR) and random forest (RF), with promising results. The experiment is carried out in the R language on the Breast Cancer Wisconsin (Diagnostic) dataset from the UCI (University of California, Irvine) ML repository. The highest accuracy attained with the BEFS model is 96.49%, the AUC (area under the curve) is 96%, and the sensitivity is 98.11%. These results are higher than those of several existing works while using only the six most relevant of the dataset's thirty-two features.
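The genetic-algorithm step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper works in R on the Breast Cancer Wisconsin (Diagnostic) data and pairs the genetic algorithm with random forest, whereas this sketch assumes synthetic data, a simple nearest-centroid classifier as the fitness evaluator, and hypothetical names and parameters (`make_toy_data`, `centroid_accuracy`, `ga_select`, `size_penalty`, population size, mutation rate).

```python
import random

def make_toy_data(rng, n_samples=200, n_features=10, informative=(0, 3, 7)):
    """Synthetic binary data (assumption): only `informative` columns depend on the label."""
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 2.0 * label  # shift informative features for class 1
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X_train, y_train, X_test, y_test, mask):
    """Accuracy of a nearest-centroid classifier restricted to the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_test, y_test):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, mu))
                for c, mu in centroids.items()}
        correct += (min(dist, key=dist.get) == t)
    return correct / len(y_test)

def ga_select(X_train, y_train, X_val, y_val, rng, pop_size=20,
              generations=15, mutation_rate=0.1, size_penalty=0.01):
    """Evolve binary feature masks; fitness rewards accuracy and penalizes subset size."""
    n = len(X_train[0])

    def fitness(mask):
        acc = centroid_accuracy(X_train, y_train, X_val, y_val, mask)
        return acc - size_penalty * sum(mask)

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_gen = [m[:] for m in ranked[:2]]  # elitism: keep the two best masks
        while len(next_gen) < pop_size:
            # Tournament selection: best of three randomly drawn masks.
            p1 = max(rng.sample(ranked, 3), key=fitness)
            p2 = max(rng.sample(ranked, 3), key=fitness)
            cut = rng.randrange(1, n)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)

rng = random.Random(0)  # fixed seed so the run is repeatable
X, y = make_toy_data(rng)
X_train, y_train = X[:150], y[:150]
X_val, y_val = X[150:], y[150:]
best_mask = ga_select(X_train, y_train, X_val, y_val, rng)
selected = [j for j, keep in enumerate(best_mask) if keep]
print("selected feature indices:", selected)
```

Because the masks are scored on the validation split, that split's accuracy is an optimistic estimate; a separate held-out test set would be needed to report figures comparable to those in the abstract.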
Pages: 6
Related papers
  • [1] Ensemble Gain Ratio Feature Selection (EGFS) Model with Machine Learning and Data Mining Algorithms for Disease Risk Prediction
    Pasha, Syed Javeed
    Mohamed, E. Syed
    PROCEEDINGS OF THE 5TH INTERNATIONAL CONFERENCE ON INVENTIVE COMPUTATION TECHNOLOGIES (ICICT-2020), 2020, : 590 - 596
  • [2] Novel Feature Reduction (NFR) Model With Machine Learning and Data Mining Algorithms for Effective Disease Risk Prediction
    Pasha, Syed Javeed
    Mohamed, E. Syed
    IEEE ACCESS, 2020, 8 : 184087 - 184108
  • [3] New hybrid data mining model for prediction of Salmonella presence in agricultural waters based on ensemble feature selection and machine learning algorithms
    Buyrukoglu, Selim
    JOURNAL OF FOOD SAFETY, 2021, 41 (04)
  • [4] A hybrid data mining model of feature selection algorithms and ensemble learning classifiers for credit scoring
    Koutanaei, Fatemeh Nemati
    Sajedi, Hedieh
    Khanbabaei, Mohammad
    JOURNAL OF RETAILING AND CONSUMER SERVICES, 2015, 27 : 11 - 23
  • [5] Sarcopenia risk prediction and feature selection by using quantum machine learning algorithms
    Ullah, Ubaid
    Maheshwari, Danyal
    Castillo Olea, Cristian
    Zapirain, Begonya Garcia
    QUANTUM MACHINE INTELLIGENCE, 2024, 6 (02)
  • [6] Heuristic Model to Improve Feature Selection Based on Machine Learning in Data Mining
    Majumdar, Jahin
    Mal, Anwesha
    Gupta, Shruti
    2016 6TH INTERNATIONAL CONFERENCE - CLOUD SYSTEM AND BIG DATA ENGINEERING (CONFLUENCE), 2016, : 73 - 77
  • [7] Data-Driven Diabetes Risk Factor Prediction Using Machine Learning Algorithms with Feature Selection Technique
    Kakoly, Israt Jahan
    Hoque, Md. Rakibul
    Hasan, Najmul
    SUSTAINABILITY, 2023, 15 (06)
  • [8] Analyzing the impact of feature selection methods on machine learning algorithms for heart disease prediction
    Noroozi, Zeinab
    Orooji, Azam
    Erfannia, Leila
    SCIENTIFIC REPORTS, 2023, 13 (01)
  • [9] Hybrid of Ensemble Machine Learning and Nature-Inspired Algorithms for Divorce Prediction
    Sahle, Kalkidan A.
    Yibre, Abdulkerim M.
    PAN-AFRICAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, PT II, PANAFRICON AI 2023, 2024, 2069 : 242 - 264