A Novel Dataset-Similarity-Aware Approach for Evaluating Stability of Software Metric Selection Techniques

Authors
Wang, Huanjing [1]
Khoshgoftaar, Taghi M. [1]
Wald, Randall [1]
Napolitano, Amri [1]
Affiliations
[1] Western Kentucky Univ, Bowling Green, KY 42101 USA
Keywords
PREDICTION
DOI
Not available
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Software metric (feature) selection is an important preprocessing step before building software defect prediction models. Although much research has analyzed the classification performance of feature selection methods, fewer works have focused on their stability (robustness). Stability matters because feature selection methods that reliably produce the same results despite changes to the data are more trustworthy. Of the papers studying stability, most either compare the features chosen from different random subsamples of the dataset or compare each random subsample with the original dataset. The first approach leaves the degree of overlap between subsamples unknown; the second compares datasets of different sizes. In this work, we propose a fixed-overlap partition algorithm that generates a pair of subsamples with the same number of instances and a specified degree of overlap. We empirically evaluate the stability of 19 feature selection methods in terms of degree of overlap and feature subset size using 16 real software metrics datasets. The consistency index is used as the stability measure, and we show that RF is the most stable filter. Results also show that degree of overlap and feature subset size do affect the stability of feature selection methods.
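The abstract gives no procedural details, so the following is only a minimal sketch of what a fixed-overlap partition and a consistency-index computation might look like. The function names, the `overlap` parameter (taken here as the fraction of each subsample shared with the other), and the use of Kuncheva's formulation of the consistency index are all assumptions for illustration, not the authors' implementation.

```python
import random


def fixed_overlap_pair(n_instances, subsample_size, overlap, seed=None):
    """Draw two equal-size index subsamples sharing a fixed fraction of instances.

    Hypothetical helper: `overlap` is assumed to be the fraction of each
    subsample that also appears in the other one.
    """
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must lie in [0, 1]")
    shared = round(overlap * subsample_size)   # instances common to both
    unique = subsample_size - shared           # instances private to each
    needed = shared + 2 * unique               # distinct instances required
    if needed > n_instances:
        raise ValueError("dataset too small for this size and overlap")
    rng = random.Random(seed)
    picked = rng.sample(range(n_instances), needed)  # sampled without replacement
    common = picked[:shared]
    first = common + picked[shared:shared + unique]
    second = common + picked[shared + unique:]
    return sorted(first), sorted(second)


def consistency_index(subset_a, subset_b, n_features):
    """Consistency index for two equal-size feature subsets.

    Assumes Kuncheva's definition: (r*n - k^2) / (k*(n - k)), where k is the
    subset size, r the number of shared features, n the total feature count.
    """
    a, b = set(subset_a), set(subset_b)
    k = len(a)
    if len(b) != k or not 0 < k < n_features:
        raise ValueError("subsets must have equal size k with 0 < k < n_features")
    r = len(a & b)
    return (r * n_features - k * k) / (k * (n_features - k))
```

In such a setup, a feature selection method would be run once on each subsample, and the two selected feature subsets compared with `consistency_index`; repeating this over several overlap values and subset sizes gives the kind of comparison the abstract describes.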
Pages: 1 / 8
Page count: 8