A New Over-sampling Technique Based on SVM for Imbalanced Diseases Data

Citations: 0
Authors
Wang, Jinjin [1]
Yao, Yukai [1]
Zhou, Hanhai [1]
Leng, Mingwei [1]
Chen, Xiaoyun [1]
Affiliations
[1] Lanzhou Univ, Sch Informat Sci & Engn, Lanzhou 730000, Peoples R China
Keywords
Imbalanced diseases data; Over-sampling; Support vector machine; RECOGNITION;
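DOI
Not available
CLC classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract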
Many real-world disease data sets are imbalanced: the majority of the records belong to normal persons and only a minority to abnormal ones. Many researchers have ignored this imbalance, so their learning models are usually biased toward the majority normal class. To deal with this problem, a new over-sampling technique is proposed that over-samples the minority class to balance the data samples and improve the Support Vector Machine (SVM) on imbalanced disease data sets. First, a K-Nearest Neighbor (KNN) graph is built for the minority class. Second, the proposed technique derives a Minimum Spanning Tree (MST) from the graph. Third, it generates synthetic samples by applying SMOTE along the direct path in the tree. The performance of the proposed technique with SVM is evaluated on several disease data sets taken from the UCI machine learning repository, and the experiments show that it can improve the Sensitivity and G-Mean values.
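The three steps in the abstract can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: it assumes Euclidean distance, treats "the direct path in the tree" as a single MST edge, and uses hypothetical function names (`knn_graph`, `minimum_spanning_forest`, `oversample`, `sensitivity_and_gmean`) and parameters (`k`, `n_new`, `seed`) that do not come from the paper. G-Mean is computed with its usual definition, the square root of sensitivity times specificity.

```python
import math
import random


def knn_graph(points, k):
    """Undirected edges (dist, i, j) joining each point to its k nearest neighbours."""
    edges = set()
    for i, p in enumerate(points):
        nearest = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )[:k]
        for d, j in nearest:
            edges.add((d, min(i, j), max(i, j)))
    return sorted(edges)  # sorted by distance, ready for Kruskal


def minimum_spanning_forest(n, edges):
    """Kruskal's algorithm; returns a forest if the KNN graph is disconnected."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree


def oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by SMOTE-style interpolation on tree edges.

    Assumption: each synthetic sample lies on one randomly chosen MST edge.
    """
    rng = random.Random(seed)
    tree = minimum_spanning_forest(len(minority), knn_graph(minority, k))
    if not tree:  # fewer than two minority samples: nothing to interpolate
        return []
    synthetic = []
    for _ in range(n_new):
        i, j = rng.choice(tree)
        gap = rng.random()  # interpolation factor in [0, 1)
        a, b = minority[i], minority[j]
        synthetic.append(tuple(x + gap * (y - x) for x, y in zip(a, b)))
    return synthetic


def sensitivity_and_gmean(y_true, y_pred, positive=1):
    """Sensitivity (recall on the positive class) and G-Mean."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, math.sqrt(sens * spec)
```

In such a sketch the synthetic samples would be appended to the minority class before training the SVM. Restricting interpolation to tree edges keeps new points between samples that are close neighbours, which is one plausible motivation for using the MST instead of arbitrary KNN pairs.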
Pages: 1224-1228
Number of pages: 5