Comparative analysis of machine learning and ensemble approaches for hepatitis B prediction using data mining with synthetic minority oversampling technique

Cited by: 0
Authors
Alizargar, Azadeh [1]
Chang, Yang-Lang [1]
Tan, Tan-Hsu [1]
Liu, Tsung-Yu [2]
Affiliations
[1] Natl Taipei Univ Technol, Coll Elect Engn & Comp Sci, Dept Elect Engn, Taipei 10608, Taiwan
[2] Lunghwa Univ Sci & Technol, Dept Multimedia & Game Sci, Taoyuan 333326, Taiwan
Keywords
Hepatitis B; Liver damage; Early detection; Machine learning; Ensemble model; SMOTE; Risk; Diagnosis; Virus
DOI
10.1007/s12553-023-00802-x
Chinese Library Classification (CLC) number
R-058
Abstract
Purpose: Hepatitis B, caused by the hepatitis B virus (HBV), can damage the liver without noticeable symptoms. Early detection is crucial to prevent transmission and improve recovery. The main goal is to predict Hepatitis B from cost-effective laboratory test data using machine learning. The primary focus is on evaluating how effectively various algorithms predict the disease and how far they can improve early diagnosis.
Methods: Six algorithms (Support Vector Machine, K-Nearest Neighbors, Logistic Regression, decision tree, extreme gradient boosting, and random forest) were employed alongside an ensemble model. The analysis was run in two rounds: once with all features and once with only the key attributes. The Synthetic Minority Oversampling Technique (SMOTE) was used to address class imbalance. Each predictive technique was assessed with the confusion matrix, precision, recall, F1 score, accuracy, the receiver operating characteristic (ROC) curve, the area under the curve (AUC), and the mean absolute error (MAE). The National Health and Nutrition Examination Survey (NHANES) dataset was used.
Results: The experimental results demonstrate that the ensemble model attained the highest accuracy (97%) and AUC (0.997) in comparison to existing models. The analysis also revealed that specific key features carry substantial predictive significance within this model.
Conclusion: The study underscores the potential of the ensemble model as a valuable tool for medical practitioners, using cost-effective and readily obtainable laboratory test data to predict Hepatitis B with high accuracy. By facilitating early diagnosis and intervention, this research presents a promising avenue for improving patient outcomes in Hepatitis B.
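The abstract names SMOTE as the remedy for class imbalance but does not describe the implementation the authors used. As a rough illustration of the general idea only, the sketch below creates synthetic minority-class records by interpolating between a minority sample and one of its k nearest minority neighbours. The function name `smote`, its parameters, and the toy data are assumptions made for this example and are not taken from the paper or from NHANES.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples.

    minority: list of equal-length feature tuples (the under-represented class).
    This is a simplified, illustrative version of the technique.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Toy data, invented for illustration: two numeric features per record.
positives = [(1.0, 2.0), (1.2, 1.9), (0.9, 2.2), (1.1, 2.1)]
negatives = [(3.0 + 0.1 * i, 0.5) for i in range(12)]

# Oversample the minority class until both classes have the same size.
new_points = smote(positives, n_synthetic=len(negatives) - len(positives), k=3)
balanced_positives = positives + new_points
print(len(balanced_positives), len(negatives))
```

In practice, oversampling of this kind is usually applied to the training split only, so that synthetic records do not leak into the data used to compute accuracy or AUC. Whether the study followed that protocol is not stated in the abstract.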
Pages: 109-118
Page count: 10