Enhanced Preprocessing Approach Using Ensemble Machine Learning Algorithms for Detecting Liver Disease

Cited by: 19
Authors
Md, Abdul Quadir [1]
Kulkarni, Sanika [1]
Joshua, Christy Jackson [1]
Vaichole, Tejas [1]
Mohan, Senthilkumar [2]
Iwendi, Celestine [3]
Affiliations
[1] Vellore Inst Technol, Sch Comp Sci & Engn, Chennai 600127, India
[2] Vellore Inst Technol, Sch Informat Technol & Engn, Vellore 632014, India
[3] Univ Bolton, Sch Creat Technol, Bolton BL3 5AB, England
Keywords
liver disease; machine learning; multivariate imputation; feature scaling; ensemble learning; gradient boosting; XGBoost; bagging; random forest; extra tree classifier; stacking; prediction; model
DOI
10.3390/biomedicines11020581
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Liver disease has increased sharply worldwide, and many people die without ever knowing that they have it. Because it produces few symptoms, liver disease is extremely difficult to detect until its final stage. Early detection lets patients begin treatment sooner, which can save their lives. Ensemble learning algorithms have become increasingly popular because they perform better than traditional machine learning algorithms. In this context, this paper proposes a novel architecture based on ensemble learning and enhanced preprocessing to predict liver disease using the Indian Liver Patient Dataset (ILPD). Six ensemble learning algorithms are applied to the ILPD, and their results are compared with those reported in existing studies. The proposed model uses several data preprocessing methods, such as data balancing, feature scaling, and feature selection, together with appropriate imputation, to improve accuracy. Multivariate imputation is applied to fill in missing values. A log1p transformation is applied to skewed columns, along with standardization, min-max scaling, maximum absolute scaling, and robust scaling. Features are selected using several methods, including univariate selection, feature importance, and a correlation matrix. The preprocessed data are then used to train Gradient Boosting, XGBoost, Bagging, Random Forest, Extra Tree, and Stacking ensemble learning algorithms. The results of the six models are compared with each other, as well as with the models used in other research works. The proposed model with the extra tree classifier and with random forest outperformed the other methods, reaching the highest testing accuracies of 91.82% and 86.06%, respectively, presenting our method as a real-world solution for detecting liver disease.
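The abstract names a log1p transformation and four scaling techniques but does not show how they are computed. The sketch below illustrates one common definition of each using only the Python standard library. It is not the authors' implementation: the function names are hypothetical, the sample column is made up rather than taken from the ILPD, standardization is assumed to use the population standard deviation, and robust scaling is assumed to mean centering on the median and dividing by the interquartile range.

```python
import math
import statistics


def log1p_transform(values):
    """Apply log(1 + x) to compress a right-skewed, non-negative column."""
    return [math.log1p(v) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def min_max_scale(values):
    """Map the column linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def max_abs_scale(values):
    """Divide by the largest absolute value, giving a range within [-1, 1]."""
    peak = max(abs(v) for v in values)
    return [v / peak for v in values]


def robust_scale(values):
    """Center on the median and divide by the interquartile range."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]


# Made-up, right-skewed sample column (illustrative only, not ILPD data).
skewed = [0.5, 0.8, 1.0, 1.5, 12.0]

# Reduce the skew first, then apply one of the scalers to the result.
transformed = log1p_transform(skewed)
scaled = robust_scale(transformed)
```

Robust scaling depends only on the median and quartiles, so a single extreme value such as 12.0 affects it less than it affects min-max or maximum absolute scaling, which is one reason to compare several scalers on skewed clinical measurements.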
Pages: 23