Combat with Class Overlapping in Software Defect Prediction Using Neighbourhood Metric

Cited by: 0
Authors
Gupta S. [1]
Richa [2]
Kumar R. [3, 4]
Jain K.L. [3, 4]
Affiliations
[1] School of Computer Science Engineering, Vellore Institute of Technology, Chennai
[2] Department of Computer Science and Engineering, Birla Institute of Technology, Mesra, Ranchi
[3] School of Electronics Engineering, Vellore Institute of Technology, Chennai
[4] School of Computer & Communication Engineering, Manipal University Jaipur, Jaipur
Keywords
AUC; Class imbalance; Class overlap; G-mean; Recall; Software defect prediction
DOI
10.1007/s42979-023-02082-8
Abstract
Characterizing data is an open problem that has received sustained attention in data analysis and machine learning research over the last decades. Researchers have defined data complexity measures to identify the characteristics of a dataset and to assess its fitness for purpose. The presence of class overlap in datasets significantly affects the performance of classifiers. Data complexity measures provide quantitative insight into the quality of a dataset and the overlap present in it. Several researchers have also applied machine learning techniques to healthcare datasets and in software defect prediction. In this paper, our aim is to evaluate the effectiveness of a new overlap measure, Near Enemy Ratio, and its effect on complexity measures and on the performance of the classifier. The new ratio is based on the nearest instances to the target instance. The experimental results offer insight into the usefulness of the method and help us decide whether this solution should be applied to a particular dataset or not. © 2023, The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd.
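The abstract says only that the new ratio is based on the instances nearest to a target instance; it does not give the formula. The sketch below is therefore an illustrative assumption, not the authors' definition: it scores each instance by the fraction of its k nearest neighbours (Euclidean distance) that carry a different class label. The function name `near_enemy_ratio`, the parameter `k`, and the toy data are all hypothetical.

```python
import math


def near_enemy_ratio(X, y, k=2):
    """Assumed neighbourhood overlap score, one value per instance.

    For each instance, returns the fraction of its k nearest neighbours
    whose label differs from its own. This is a guess at the general
    idea of a neighbourhood-based overlap measure, not the paper's formula.
    """
    ratios = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Distance from instance i to every other instance, nearest first.
        dists = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        # "Enemies" are neighbours that belong to a different class.
        enemies = sum(1 for j in neighbours if y[j] != yi)
        ratios.append(enemies / len(neighbours))
    return ratios


# Toy data: two tight clusters, plus one defect-labelled point (the last)
# sitting inside the non-defective cluster.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
     (1.0, 1.0), (1.1, 1.0), (1.0, 1.1),
     (0.05, 0.05)]
y = [0, 0, 0, 1, 1, 1, 1]

ratios = near_enemy_ratio(X, y, k=2)
print(ratios)
```

In this toy example the last instance scores 1.0 because both of its nearest neighbours belong to the other class, while the three instances in the defective cluster score 0.0. Under this assumed definition, a high score flags an instance in an overlapping region.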