Data Preprocessing for DES-KNN and Its Application to Imbalanced Medical Data Classification

Cited by: 4
Authors
Kinal, Maciej [1]
Wozniak, Michal [1]
Affiliations
[1] Wroclaw Univ Sci & Technol, Fac Elect, Dept Syst & Comp Networks, Wroclaw, Poland
Keywords
Dynamic ensemble selection; DES-KNN; Data preprocessing; Imbalanced data; Oversampling; SELECTION
DOI
10.1007/978-3-030-41964-6_51
Chinese Library Classification (CLC)
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Learning from imbalanced data is a vital challenge for pattern classification. Imbalanced data are common in medical decision tasks, where at least one of the classes is represented by only a very small minority of the available data. We propose a novel framework for training base classifiers and preparing the dynamic selection dataset (DSEL) that integrates data preprocessing with dynamic ensemble selection (DES) methods for imbalanced data classification. The DES-KNN algorithm was chosen as the DES method, and its modifications based on training and validation sets oversampled with SMOTE are discussed. The proposed modifications were evaluated in computer experiments carried out on 15 medical datasets with various imbalance ratios. The experimental results show that the proposed framework is very useful, especially for tasks characterized by a small imbalance ratio.
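The preprocessing step named in the abstract, oversampling both the training set and the DSEL with SMOTE, can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no algorithmic details, so the function names `smote` and `balance`, the toy data, and the choice to oversample to a 1:1 class ratio are all assumptions made for illustration. The sketch uses only the standard SMOTE idea of interpolating between a minority sample and one of its nearest minority-class neighbours. The DES-KNN selection step itself is not shown.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample the minority class until both classes are equally large
    (hypothetical helper; the 1:1 target ratio is an assumption)."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_new = (len(X) - len(minority)) - len(minority)
    new_points = smote(minority, n_new, k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Toy binary data (made up): label 1 is the minority class.
X_train = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.1, 0.4),
           (0.4, 0.2), (1.0, 1.0), (1.1, 0.9)]
y_train = [0, 0, 0, 0, 0, 0, 1, 1]
X_dsel = [(0.2, 0.2), (0.3, 0.1), (0.0, 0.3), (0.2, 0.4), (1.0, 0.8), (0.9, 1.0)]
y_dsel = [0, 0, 0, 0, 1, 1]

# Oversample the training set (used to fit base classifiers) and the DSEL
# (used to estimate local competence) independently of each other.
X_train_bal, y_train_bal = balance(X_train, y_train, minority_label=1, k=1)
X_dsel_bal, y_dsel_bal = balance(X_dsel, y_dsel, minority_label=1, k=1)
```

In practice one would more likely rely on maintained implementations, for example the SMOTE class in imbalanced-learn and the DES-KNN implementation in DESlib, rather than hand-written code like the above.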
Pages: 589-599
Number of pages: 11