Enhanced Preprocessing Approach Using Ensemble Machine Learning Algorithms for Detecting Liver Disease

Citations: 19
Authors
Md, Abdul Quadir [1]
Kulkarni, Sanika [1]
Joshua, Christy Jackson [1]
Vaichole, Tejas [1]
Mohan, Senthilkumar [2]
Iwendi, Celestine [3]
Affiliations
[1] Vellore Inst Technol, Sch Comp Sci & Engn, Chennai 600127, India
[2] Vellore Inst Technol, Sch Informat Technol & Engn, Vellore 632014, India
[3] Univ Bolton, Sch Creat Technol, Bolton BL3 5AB, England
Keywords
liver disease; machine learning; multivariate imputation; feature scaling; ensemble learning; gradient boosting; XGBoost; bagging; random forest; extra tree classifier; stacking; PREDICTION; MODEL
DOI
10.3390/biomedicines11020581
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
Liver disease has increased sharply worldwide, and many people die without knowing that they have it. Because it produces few symptoms, liver disease is extremely difficult to detect until its final stage. Early detection allows patients to begin treatment sooner, which can save lives. Ensemble learning algorithms have become increasingly popular because they often perform better than traditional machine learning algorithms. In this context, this paper proposes a novel architecture based on ensemble learning and enhanced preprocessing to predict liver disease using the Indian Liver Patient Dataset (ILPD). Six ensemble learning algorithms are applied to the ILPD, and their results are compared with those reported in existing studies. The proposed model uses several data preprocessing methods, namely data balancing, feature scaling, and feature selection, together with appropriate imputation, to improve accuracy. Multivariate imputation is applied to fill in missing values. Skewed columns are transformed with log1p, and standardization, min-max scaling, maximum absolute scaling, and robust scaling are applied. Features are selected using several methods, including univariate selection, feature importance, and a correlation matrix. Gradient Boosting, XGBoost, Bagging, Random Forest, Extra Tree, and Stacking ensemble learning algorithms are trained on the preprocessed data. The results of the six models are compared with each other, as well as with the models used in other research works. The proposed models using the extra tree classifier and random forest outperformed the other methods, with the highest testing accuracies of 91.82% and 86.06%, respectively, portraying our method as a real-world solution for detecting liver disease.
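The transformation and scaling steps named in the abstract can be illustrated with a minimal sketch. This record does not show the authors' implementation, so the code below is an assumption-laden illustration using only the Python standard library: the function names, the use of the population standard deviation, the inclusive quartile method, and the sample values are all hypothetical choices, not details from the paper.

```python
import math
import statistics


def log1p_transform(values):
    """Compress a right-skewed, non-negative column with log(1 + x)."""
    return [math.log1p(v) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def min_max_scale(values):
    """Map the column linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def max_abs_scale(values):
    """Divide by the largest absolute value, giving a range within [-1, 1]."""
    peak = max(abs(v) for v in values)
    return [v / peak for v in values]


def robust_scale(values):
    """Centre on the median and divide by the interquartile range."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]


# Hypothetical right-skewed measurements; NOT values from the ILPD.
raw = [0.5, 0.8, 1.0, 1.2, 2.0, 7.5, 30.0]

logged = log1p_transform(raw)
scaled = {
    "standard": standardize(logged),
    "min_max": min_max_scale(logged),
    "max_abs": max_abs_scale(logged),
    "robust": robust_scale(logged),
}
```

Each scaler assumes a non-constant column; a constant column would cause a division by zero and would need separate handling. Robust scaling depends only on the median and quartiles, so it is the least affected by extreme values such as the 30.0 in the sample.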
Pages: 23
Related papers
50 in total
  • [31] Detecting Hate Speech on Twitter Network Using Ensemble Machine Learning
    Mutanga, Raymond T.
    Naicker, Nalindren
    Olugbara, Oludayo O.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (03) : 331 - 339
  • [32] An Ensemble Machine Learning Approach for Detecting and Classifying Malware Attacks on Mobile Devices
    Alsharif, Eiman
    Alharby, Maher
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2025,
  • [33] Efficient Data Preprocessing with Ensemble Machine Learning Technique for the Early Detection of Chronic Kidney Disease
    Venkatesan, Vinoth Kumar
    Ramakrishna, Mahesh Thyluru
    Izonin, Ivan
    Tkachenko, Roman
    Havryliuk, Myroslav
    APPLIED SCIENCES-BASEL, 2023, 13 (05):
  • [34] DISEASE FORECAST USING MACHINE LEARNING ALGORITHMS
    Hussain, Mohammed Muzaffar
    Devi, S. Kalpana
    JOURNAL OF APPLIED MATHEMATICS & INFORMATICS, 2022, 40 (5-6): : 1151 - 1165
  • [35] Explainable Heart Disease Prediction Using Ensemble-Quantum Machine Learning Approach
    Abdulsalam, Ghada
    Meshoul, Souham
    Shaiba, Hadil
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 36 (01): : 761 - 779
  • [36] Ensemble Machine Learning Approach for Parkinson's Disease Detection Using Speech Signals
    Bukhari, Syed Nisar Hussain
    Ogudo, Kingsley A.
    MATHEMATICS, 2024, 12 (10)
  • [37] Enhanced Twitter bot detection using ensemble machine learning
    Shukla, Hrushikesh
    Jagtap, Nakshatra
    Patil, Balaji
    PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON INVENTIVE COMPUTATION TECHNOLOGIES (ICICT 2021), 2021, : 930 - 936
  • [38] Liver Diseases Classification Using Machine Learning Algorithms
    Jovovic, Ivan
    Grebovic, Marko
    Pokvic, Lejla Gurbeta
    Popovic, Tomo
    Cakic, Stevan
    MEDICON 2023 AND CMBEBIH 2023, VOL 1, 2024, 93 : 585 - 593
  • [39] Securing cloud-enabled smart cities by detecting intrusion using spark-based stacking ensemble of machine learning algorithms
    Ghazi, Mohd. Rehan
    Raghava, N. S.
    ELECTRONIC RESEARCH ARCHIVE, 2024, 32 (02): : 1268 - 1307
  • [40] Disease Detection Using Ensemble Model in Machine Learning
    Rojalin Mohapatra
    Parimala Kumar Giri
    Irfan Sayyad
    Amaresh Sahu
    Biswajit Brahma
    Nilayam Kumar Kamila
    SN Computer Science, 6 (3)