A Novel Dataset-Similarity-Aware Approach for Evaluating Stability of Software Metric Selection Techniques

Cited by: 0
Authors
Wang, Huanjing [1]
Khoshgoftaar, Taghi M. [1]
Wald, Randall [1]
Napolitano, Amri [1]
Affiliations
[1] Western Kentucky Univ, Bowling Green, KY 42101 USA
Keywords
PREDICTION
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Software metric (feature) selection is an important preprocessing step before building software defect prediction models. Although much research has analyzed the classification performance of feature selection methods, fewer works have focused on their stability (robustness). Stability is important because feature selection methods that reliably produce the same results despite changes to the data are more trustworthy. Of the papers studying stability, most either compare the features chosen from different random subsamples of the dataset or compare each random subsample with the original dataset. The first approach leaves the degree of overlap between the subsamples unknown, and the second compares datasets of different sizes. In this work, we propose a fixed-overlap partition algorithm which generates a pair of subsamples with the same number of instances and a specified degree of overlap. We empirically evaluate the stability of 19 feature selection methods in terms of degree of overlap and feature subset size using 16 real software metrics datasets. The consistency index is used as the stability measure, and we show that RF is the most stable filter. Results also show that both degree of overlap and feature subset size affect the stability of feature selection methods.
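The abstract does not spell out the partition procedure or the stability formula, so the following Python sketch is only an illustration under stated assumptions. It assumes that "degree of overlap" means the fraction of each subsample's instances that also appear in the other subsample, and that the consistency index is Kuncheva's formulation, (r*n - k^2) / (k*(n - k)), for two feature subsets of size k sharing r features out of n. The function names `fixed_overlap_pair` and `consistency_index` are hypothetical and do not come from the paper.

```python
import random


def fixed_overlap_pair(num_instances, subsample_size, overlap, seed=None):
    """Draw two equal-size index subsamples with a fixed number of shared indices.

    Assumption: `overlap` is the fraction of each subsample shared with the other.
    """
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must be in [0, 1]")
    shared = round(overlap * subsample_size)   # instances present in both subsamples
    unique = subsample_size - shared           # instances private to each subsample
    needed = shared + 2 * unique               # distinct instances required overall
    if needed > num_instances:
        raise ValueError("dataset too small for this size and overlap")

    rng = random.Random(seed)
    picked = rng.sample(range(num_instances), needed)  # sampling without replacement
    common = picked[:shared]
    first = common + picked[shared:shared + unique]
    second = common + picked[shared + unique:]
    return first, second


def consistency_index(subset_a, subset_b, num_features):
    """Kuncheva's consistency index for two feature subsets of equal size."""
    a, b = set(subset_a), set(subset_b)
    k = len(a)
    if len(b) != k:
        raise ValueError("both subsets must have the same size")
    if not 0 < k < num_features:
        raise ValueError("subset size must satisfy 0 < k < num_features")
    r = len(a & b)  # number of features selected in both subsets
    return (r * num_features - k * k) / (k * (num_features - k))


if __name__ == "__main__":
    first, second = fixed_overlap_pair(100, 40, 0.5, seed=0)
    print(len(first), len(second), len(set(first) & set(second)))
    print(consistency_index([0, 1, 2], [0, 1, 4], 10))
```

In a stability study of this kind, a feature selection method would be run on each of the two subsamples, and the two resulting feature subsets would then be compared with the index. The sketch ignores class stratification, which a defect-prediction dataset with few faulty modules may need.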
Pages: 1-8
Number of pages: 8
Related papers
50 in total
  • [1] A Comparative Study on the Stability of Software Metric Selection Techniques
    Wang, Huanjing
    Khoshgoftaar, Taghi M.
    Wald, Randall
    Napolitano, Amri
    2012 11TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2012), VOL 2, 2012, : 307 - 313
  • [2] Novel Image Similarity Metric for Evaluating Denoising and Restoration Techniques
    Ciobanu, Adrian
    Barbu, Tudor
    Nita, Cristina
    2017 IEEE INTERNATIONAL CONFERENCE ON E-HEALTH AND BIOENGINEERING CONFERENCE (EHB), 2017, : 470 - 473
  • [3] Stability of Filter- and Wrapper-based Software Metric Selection Techniques
    Wang, Huanjing
    Khoshgoftaar, Taghi M.
    Napolitano, Amri
    2014 IEEE 15TH INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION (IRI), 2014, : 309 - 314
  • [4] A STUDY OF SOFTWARE METRIC SELECTION TECHNIQUES: STABILITY ANALYSIS AND DEFECT PREDICTION MODEL PERFORMANCE
    Wang, Huanjing
    Khoshgoftaar, Taghi M.
    Liang, Qianhui
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2013, 22 (05)
  • [5] A Novel Approach to Evaluating Similarity in Computer Forensic Investigations
    Hankins, Ryan Q.
    Liu, Jigang
    2014 IEEE INTERNATIONAL CONFERENCE ON ELECTRO/INFORMATION TECHNOLOGY (EIT), 2014, : 567 - 572
  • [6] Evaluating architectural stability using a metric-based approach
    Tonu, Subrina Anjum
    Ashkan, Azin
    Tahvildari, Ladan
    10TH EUROPEAN CONFERENCE ON SOFTWARE MAINTENANCE AND REENGINEERING, PROCEEDINGS, 2006, : 259 - +
  • [7] Stability Aware Software Refactoring Using Hybrid Search Based Techniques
    Vimaladevi, M.
    Zayaraz, G.
    2017 INTERNATIONAL CONFERENCE ON TECHNICAL ADVANCEMENTS IN COMPUTERS AND COMMUNICATIONS (ICTACC), 2017, : 32 - 35
  • [8] BINCODEX: A comprehensive and multi-level dataset for evaluating binary code similarity detection techniques
    Zhang P.
    Wu C.
    Wang Z.
    BenchCouncil Transactions on Benchmarks, Standards and Evaluations, 2024, 4 (02):
  • [9] A feature selection approach based on a similarity measure for software defect prediction
    Yu, Qiao
    Jiang, Shu-juan
    Wang, Rong-cun
    Wang, Hong-yang
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2017, 18 (11) : 1744 - 1753