A New Over-sampling Technique Based on SVM for Imbalanced Diseases Data

Cited by: 0
Authors
Wang, Jinjin [1]
Yao, Yukai [1]
Zhou, Hanhai [1]
Leng, Mingwei [1]
Chen, Xiaoyun [1]
Affiliation
[1] Lanzhou Univ, Sch Informat Sci & Engn, Lanzhou 730000, Peoples R China
Keywords
Imbalanced diseases data; Over-sampling; Support vector machine; RECOGNITION;
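DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract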
Many real-world disease data sets are imbalanced: most records belong to normal persons and only a small minority to abnormal ones. Many researchers have ignored this imbalance, so their learning models are usually biased toward the majority (normal) class. To address this problem, a new over-sampling technique is proposed that over-samples the minority class to balance the data and improve the performance of the Support Vector Machine (SVM) on imbalanced disease data sets. First, a K-Nearest Neighbor (KNN) graph is built for the minority class. Second, a Minimum Spanning Tree (MST) is derived from this graph. Third, synthetic samples are generated by applying SMOTE along the direct path in the tree. The technique, combined with SVM, is evaluated on several disease data sets from the UCI Machine Learning Repository, and the experiments show that it can improve the Sensitivity and G-Mean values.
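The three steps in the abstract can be illustrated with a minimal sketch. The abstract does not specify the value of k, how tree edges are selected, or how many samples are generated per edge, so everything below is an assumption made for illustration: the function names (`knn_graph`, `minimum_spanning_forest`, `oversample_minority`) are hypothetical, Kruskal's algorithm is used to obtain the tree, edges are picked uniformly at random, and "SMOTE along the direct path" is read as linear interpolation between the two endpoints of a tree edge. It is not the authors' implementation.

```python
import math
import random


def knn_graph(points, k):
    """Build an undirected KNN graph as {(i, j): distance} with i < j."""
    edges = {}
    for i, p in enumerate(points):
        # Distances from point i to every other point, nearest first.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        for d, j in dists[:k]:
            edges[(min(i, j), max(i, j))] = d
    return edges


def minimum_spanning_forest(n, edges):
    """Kruskal's algorithm; yields a forest if the KNN graph is disconnected."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for (i, j), _ in sorted(edges.items(), key=lambda e: e[1]):
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so it creates no cycle
            parent[ri] = rj
            tree.append((i, j))
    return tree


def oversample_minority(points, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples on edges of the spanning tree.

    Assumption: each sample is a SMOTE-style interpolation between the two
    endpoints of a randomly chosen tree edge. Needs at least two points.
    """
    rng = random.Random(seed)
    tree = minimum_spanning_forest(len(points), knn_graph(points, k))
    synthetic = []
    for _ in range(n_new):
        i, j = rng.choice(tree)
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(a + t * (b - a) for a, b in zip(points[i], points[j]))
        )
    return synthetic


# Toy minority class: two small groups of 2-D points (made-up values).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
tree_edges = minimum_spanning_forest(len(minority), knn_graph(minority, k=2))
new_samples = oversample_minority(minority, n_new=10, k=2)
```

Under these assumptions every synthetic point lies on a segment joining two minority samples that are adjacent in the tree, so new samples follow the minority class's local structure rather than bridging arbitrary pairs. The balanced set would then be passed to an SVM and scored with Sensitivity and G-Mean, as the abstract describes.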
Pages: 1224-1228
Number of pages: 5