Comparative analysis of machine learning and ensemble approaches for hepatitis B prediction using data mining with synthetic minority oversampling technique

Cited by: 0
Authors
Alizargar, Azadeh [1 ]
Chang, Yang-Lang [1 ]
Tan, Tan-Hsu [1 ]
Liu, Tsung-Yu [2 ]
Affiliations
[1] Natl Taipei Univ Technol, Coll Elect Engn & Comp Sci, Dept Elect Engn, Taipei 10608, Taiwan
[2] Lunghwa Univ Sci & Technol, Dept Multimedia & Game Sci, Taoyuan 333326, Taiwan
Keywords
Hepatitis B; Liver damage; Early detection; Machine learning; Ensemble model; SMOTE; Risk; Diagnosis; Virus
DOI
10.1007/s12553-023-00802-x
Chinese Library Classification number
R-058
Abstract
Purpose: Hepatitis B, caused by the Hepatitis B virus (HBV), can harm the liver without noticeable symptoms. Early detection is crucial to prevent transmission and enhance recovery. The main goal is to predict Hepatitis B from cost-effective lab test data by utilizing machine learning. The primary focus is on evaluating the effectiveness of various algorithms in predicting the disease and their potential to enhance early diagnosis capabilities.
Methods: Six distinct algorithms (Support Vector Machine, K-nearest Neighbors, Logistic Regression, decision tree, extreme gradient boosting, random forest) were employed alongside an ensemble model. Analysis involved two rounds: one considering all features and one considering only key attributes. The Synthetic Minority Oversampling Technique (SMOTE) was employed to address data imbalance. Various metrics, including the confusion matrix, precision, recall, F1 score, accuracy, receiver operating characteristic (ROC) curve, area under the curve (AUC), and mean absolute error (MAE), were utilized to assess the efficacy of each predictive technique. The National Health and Nutrition Examination Survey (NHANES) dataset was employed.
Results: The experimental results demonstrate that the ensemble model attained the highest accuracy (97%) and AUC (0.997) in comparison to existing models. The analysis revealed that specific crucial features possess substantial predictive significance within this model.
Conclusion: The study underscores the potential of the ensemble model as a valuable tool for medical practitioners, leveraging cost-effective and readily obtainable laboratory test data to predict Hepatitis B with remarkable accuracy. By facilitating early diagnosis and intervention, this research presents a promising avenue to enhance patient outcomes in the context of Hepatitis B.
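As an illustration of the oversampling step named in the Methods, the following is a minimal sketch of the general SMOTE idea: each synthetic minority sample is an interpolation between a minority sample and one of its nearest minority neighbours. It is not the authors' implementation. The function name `smote`, its parameters, and the toy feature vectors are assumptions made for illustration, and the data are invented, not taken from NHANES.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points, each interpolated between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    Hypothetical helper for illustration only."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # gap in [0, 1) sets the position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up two-feature vectors standing in for minority-class records.
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
new_points = smote(minority, n_synthetic=6, k=3)
```

In practice, oversampling of this kind is usually applied only to the training split, so that synthetic points do not leak into the evaluation data. The abstract does not state how the authors handled this.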
Pages: 109-118
Page count: 10
Related papers
50 in total
  • [1] Comparative analysis of machine learning and ensemble approaches for hepatitis B prediction using data mining with synthetic minority oversampling technique
    Azadeh Alizargar
    Yang-Lang Chang
    Tan-Hsu Tan
    Tsung-Yu Liu
    Health and Technology, 2024, 14 : 109 - 118
  • [2] Machine Learning and Synthetic Minority Oversampling Techniques for Imbalanced Data: Improving Machine Failure Prediction
    Wah, Yap Bee
    Ismail, Azlan
    Azid, Nur Niswah Naslina
    Jaafar, Jafreezal
    Aziz, Izzatdin Abdul
    Hasan, Mohd Hilmi
    Zain, Jasni Mohamad
    CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 75 (03): : 4821 - 4841
  • [3] Credit scoring for a microcredit data set using the synthetic minority oversampling technique and ensemble classifiers
    Gicic, Adaleta
    Subasi, Abdulhamit
    EXPERT SYSTEMS, 2019, 36 (02)
  • [4] Improving the prediction accuracy in blended learning environment using synthetic minority oversampling technique
    Dimic, Gabrijela
    Rancic, Dejan
    Macek, Nemanja
    Spalevic, Petar
    Drasute, Vida
    INFORMATION DISCOVERY AND DELIVERY, 2019, 47 (02) : 76 - 83
  • [5] A Machine Learning-Based Water Potability Prediction Model by Using Synthetic Minority Oversampling Technique and Explainable AI
    Patel, Jinal
    Amipara, Charmi
    Ahanger, Tariq Ahamed
    Ladhva, Komal
    Gupta, Rajeev Kumar
    Alsaab, Hashem O. O.
    Althobaiti, Yusuf S. S.
    Ratna, Rajnish
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2022, 2022
  • [6] A Novel Synthetic Minority Oversampling Technique for Imbalanced Data Set Learning
    Barua, Sukarna
    Islam, Md. Monirul
    Murase, Kazuyuki
    NEURAL INFORMATION PROCESSING, PT II, 2011, 7063 : 735 - +
  • [7] Human Brain Stroke Prediction using Machine Learning Methods with Synthetic Minority Oversampling Approach
    Mujahid, Muhammad
    Ayesha, Noor
    Azar, Ahmad Taher
    Saba, Tanzila
    Haider, Zeeshan
    PROCEEDINGS 2024 SEVENTH INTERNATIONAL WOMEN IN DATA SCIENCE CONFERENCE AT PRINCE SULTAN UNIVERSITY, WIDS-PSU 2024, 2024, : 84 - 89
  • [8] Predicting Patterns of Firms' Vulnerability to Economic Crises Using Open Data, Synthetic Minority Oversampling Technique and Machine Learning
    Ali, Mohsan
    Loukis, Euripidis
    Charalabidis, Yannis
    PERSPECTIVES IN BUSINESS INFORMATICS RESEARCH, BIR 2023, 2023, 493 : 188 - 196
  • [9] Robust diabetic prediction using ensemble machine learning models with synthetic minority over-sampling technique
    Sampath, Pradeepa
    Elangovan, Gurupriya
    Ravichandran, Kaaveya
    Shanmuganathan, Vimal
    Pasupathi, Subbulakshmi
    Chakrabarti, Tulika
    Chakrabarti, Prasun
    Margala, Martin
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [10] Utilization of synthetic minority oversampling technique for improving potato yield prediction using remote sensing data and machine learning algorithms with small sample size of yield data
    Ebrahimy, Hamid
    Wang, Yi
    Zhang, Zhou
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2023, 201 : 12 - 25