An empirical study on the class imbalance handling techniques for different diseases

Cited by: 1
Authors
Rhmann W. [1]
Affiliations
[1] School of Computer Application, Lovely Professional University, Punjab, Phagwara
Keywords
Cost sensitive neural network; Deep learning; Disease; Genetic algorithm; Statistical test;
DOI
10.1007/s00500-024-09881-y
Abstract
Machine learning and deep learning techniques are now widely used to identify and diagnose various diseases. However, obtaining sufficient data for these models is difficult, and the collected data are usually imbalanced, i.e., there are few instances of the disease class and many instances without the disease. Such imbalanced data cause poor classifier performance in detecting the minority (disease) class. To address this class imbalance problem for medical data, we applied 17 different class imbalance handling techniques to four publicly available datasets, with Random Forest as the base classifier. A comprehensive review of “class imbalance handling techniques” covering 1990 to 2023 was also carried out using VOSviewer software to gather insights. The performance of the different class imbalance handling techniques is statistically evaluated, and the impact of the different disease datasets on prediction performance is also statistically assessed. Two novel techniques, Genetic Algorithm-Cost Sensitive-Deep Neural Network (GA-CS-DNN) and Class Imbalance Handling technique-Genetic Algorithm-Deep Neural Network (CIH-GA-DNN), are proposed for handling class imbalance problems. The proposed techniques are compared with other state-of-the-art class imbalance handling techniques, and the results show that OnesidedSelection outperformed all other techniques. A statistical test further demonstrated that OnesidedSelection performs differently from SMOTENN. Pairwise comparison shows statistically significant differences in disease prediction between the kidney and diabetes, prostate and kidney, and kidney and heart datasets. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024.
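As an illustration only, the two basic ideas behind the techniques named in the abstract (resampling the data and making the classifier cost-sensitive) can be sketched in a few lines of standard-library Python. This is not the authors' GA-CS-DNN or CIH-GA-DNN method, and it is not one of the 17 techniques as evaluated in the paper; the function names `class_weights` and `random_oversample` and the toy data are assumptions introduced for this sketch. The weight formula is the common inverse-frequency heuristic, n_samples / (n_classes * class_count).

```python
import random
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Rare classes receive larger weights, so a cost-sensitive learner
    penalizes their misclassification more heavily.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for c, m in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == c]
        for _ in range(target - m):
            out_rows.append(rng.choice(pool))
            out_labels.append(c)
    return out_rows, out_labels


# Hypothetical toy data: 8 records without the disease (0), 2 with it (1).
labels = [0] * 8 + [1] * 2
rows = [[float(i)] for i in range(10)]

weights = class_weights(labels)
bal_rows, bal_labels = random_oversample(rows, labels)
```

With this toy data the disease class gets a weight four times that of the majority class, and the oversampled set contains eight rows of each class. In practice, resampling should be applied to the training split only, so that duplicated minority rows do not leak into the evaluation data.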
Pages: 11439-11456
Page count: 17