A New Over-sampling Technique Based on SVM for Imbalanced Diseases Data

Cited by: 0
Authors
Wang, Jinjin [1]
Yao, Yukai [1]
Zhou, Hanhai [1]
Leng, Mingwei [1]
Chen, Xiaoyun [1]
Affiliation
[1] Lanzhou Univ, Sch Informat Sci & Engn, Lanzhou 730000, Peoples R China
Keywords
Imbalanced diseases data; Over-sampling; Support vector machine; RECOGNITION;
DOI
Not available
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809
Abstract
Many real-world disease data sets are imbalanced: most records belong to normal persons and only a minority to abnormal ones. Many researchers have ignored this imbalance, so their learning models are usually biased toward the majority normal class. To deal with this problem, a new over-sampling technique is proposed that over-samples the minority class to balance the data and improve the Support Vector Machine (SVM) on imbalanced disease data sets. First, a K-Nearest Neighbor (KNN) graph is built for the minority class. Second, the proposed technique obtains a Minimum Spanning Tree (MST) from the graph. Third, it generates synthetic samples by applying SMOTE along the direct path in the tree. The performance of the proposed technique with SVM is evaluated on several disease data sets taken from the UCI machine learning repository, and the experiments show that it can improve the Sensitivity and G-Mean values.
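The three steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the Euclidean distance, the default `k`, the use of Kruskal's algorithm, and the reading of "direct path in the tree" as a single tree edge are all assumptions made for illustration. SMOTE-style interpolation here means picking a random point on the segment between two connected minority samples.

```python
import math
import random


def knn_graph(points, k):
    """Undirected KNN graph as {(i, j): distance} with i < j (Euclidean, assumed)."""
    edges = {}
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        for d, j in dists[:k]:
            edges[(min(i, j), max(i, j))] = d
    return edges


def minimum_spanning_forest(n, edges):
    """Kruskal's algorithm with union-find.

    Returns the tree edges; if the KNN graph is disconnected, the result
    is a spanning forest rather than a single tree.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for (i, j), _ in sorted(edges.items(), key=lambda e: e[1]):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree


def oversample(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples along tree edges.

    Assumption: each synthetic sample is a random convex combination of
    the two endpoints of a randomly chosen tree edge. Requires at least
    two minority samples, otherwise there is no edge to sample from.
    """
    rng = random.Random(seed)
    tree = minimum_spanning_forest(len(minority), knn_graph(minority, k))
    synthetic = []
    for _ in range(n_new):
        i, j = rng.choice(tree)
        t = rng.random()  # interpolation factor in [0, 1)
        a, b = minority[i], minority[j]
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return synthetic


if __name__ == "__main__":
    # Hypothetical toy minority class with two features.
    minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
    for sample in oversample(minority, n_new=4, k=2):
        print(sample)
```

In a full pipeline, the synthetic samples would be appended to the minority class before training the SVM. Restricting interpolation to tree edges, rather than to arbitrary nearest-neighbor pairs as in plain SMOTE, is one plausible reading of how the tree constrains where new samples are placed.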
Pages: 1224-1228
Number of pages: 5
Related papers
50 in total
  • [41] RWO-Sampling: A random walk over-sampling approach to imbalanced data classification
    Zhang, Huaxiang
    Li, Mingfang
    INFORMATION FUSION, 2014, 20 : 99 - 116
  • [42] Borderline over-sampling in feature space for learning algorithms in imbalanced data environments
    Savetratanakaree, Kittipat (kittipatsavet@gmail.com), 1600, International Association of Engineers (43):
  • [43] Noise Reduction A Priori Synthetic Over-Sampling for class imbalanced data sets
    Rivera, William A.
    INFORMATION SCIENCES, 2017, 408 : 146 - 161
  • [44] Searching for Optimal Oversampling to Process Imbalanced Data: Generative Adversarial Networks and Synthetic Minority Over-Sampling Technique
    Eom, Gayeong
    Byeon, Haewon
    MATHEMATICS, 2023, 11 (16)
  • [45] Classification of imbalanced PubChem BioAssay data using an efficient algorithm coupled with synthetic minority over-sampling technique
    Hao, Ming
    Wang, Yanli
    Bryant, Stephen H.
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2014, 247
  • [46] Over-sampling imbalanced datasets using the covariance matrix
    Leguen-de Varona, Ireimis
    Madera, Julio
    Martínez-López, Yoan
    Hernández-Nieto, José Carlos
    EAI Endorsed Transactions on Energy Web, 2020, 7 (27) : 1 - 6
  • [47] Ensemble based adaptive over-sampling method for imbalanced data learning in computer aided detection of microaneurysm
    Ren, Fulong
    Cao, Peng
    Li, Wei
    Zhao, Dazhe
    Zaiane, Osmar
    COMPUTERIZED MEDICAL IMAGING AND GRAPHICS, 2017, 55 : 54 - 67
  • [48] A Novel Borderline Over-Sampling Method Based on KNN and Deep Gaussian Mixture Model for Imbalanced Data
    Zhang H.
    Xiao H.
    Yi C.
    Yuan R.
    Data Analysis and Knowledge Discovery, 2023, 7 (05) : 116 - 122
  • [49] A novel ensemble over-sampling approach based Chebyshev inequality for imbalanced multi-label data
    Ren, Weishuo
    Zheng, Yifeng
    Zhang, Wenjie
    Qing, Depeng
    Zeng, Xianlong
    Li, Guohe
    NEUROCOMPUTING, 2025, 612
  • [50] Deep convolutional neural networks with genetic algorithm-based synthetic minority over-sampling technique for improved imbalanced data classification
    Alex, Suja A.
    Nayahi, J. Jesu Vedha
    Kaddoura, Sanaa
    APPLIED SOFT COMPUTING, 2024, 156