Comparative Assessment of Software Quality Classification Techniques: An Empirical Case Study

Authors
Taghi M. Khoshgoftaar
Naeem Seliya
Affiliation
[1] Florida Atlantic University, Empirical Software Engineering Laboratory, Department of Computer Science and Engineering
Source
Empirical Software Engineering, 2004, 9(3)
Keywords
Software quality classification; decision trees; case-based reasoning; logistic regression; expected cost of misclassification; analysis of variance
DOI
Not available
Abstract
Software metrics-based quality classification models predict a software module as either fault-prone (fp) or not fault-prone (nfp). Timely application of such models can help direct quality improvement efforts to modules that are likely to be fp during operations, thereby making cost-effective use of software quality testing and enhancement resources. Since several classification techniques are available, a comparative study of some commonly used techniques can be useful to practitioners. We present a comprehensive evaluation of the relative performances of seven classification techniques and/or tools: logistic regression, case-based reasoning, classification and regression trees (CART), tree-based classification with S-PLUS, and the Sprint-Sliq, C4.5, and Treedisc algorithms. The expected cost of misclassification (ECM) is introduced as a single unified measure for comparing the performances of different software quality classification models. A function of the costs of the Type I (an nfp module misclassified as fp) and Type II (an fp module misclassified as nfp) misclassifications, ECM is computed for different cost ratios. Evaluating software quality classification models under varying cost ratios is important because the usefulness of a model depends on the system-specific costs of misclassifications. Moreover, models should be compared and preferred for cost ratios that fall within the range of interest for the given system and project domain. Software metrics were collected from four successive releases of a large legacy telecommunications system. A two-way ANOVA randomized complete block design modeling approach is used, in which the system release is treated as a block and the modeling method is treated as a factor. The predictive performance of the models is observed to differ significantly across the system releases, implying that in the software engineering domain prediction models are influenced by the characteristics of the data and the system being modeled. Multiple pairwise comparisons are performed to evaluate the relative performances of the seven models for the cost ratios of interest to the case study. In addition, the performance of the seven classification techniques is compared with a classification based on lines of code. The comparative approach presented in this paper can also be applied to other software systems.
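The abstract describes ECM as a function of the Type I and Type II misclassification costs but does not give the formula. The sketch below assumes a commonly used form, ECM = C_I · Pr(fp | nfp) · π_nfp + C_II · Pr(nfp | fp) · π_fp, where π_nfp and π_fp are the class proportions; with empirical estimates this reduces to the cost-weighted error counts divided by the number of modules. The function name, the label strings, the toy data, and the cost ratios are all illustrative assumptions, not values taken from the paper.

```python
def expected_cost_of_misclassification(actual, predicted, cost_type1, cost_type2):
    """Empirical ECM per module (assumed formulation, see lead-in).

    actual, predicted: sequences of "fp" / "nfp" labels of equal length.
    cost_type1: cost of labelling an nfp module as fp (Type I error).
    cost_type2: cost of labelling an fp module as nfp (Type II error).
    """
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")

    # Type I: module is actually nfp but predicted fp.
    type1 = sum(1 for a, p in zip(actual, predicted) if a == "nfp" and p == "fp")
    # Type II: module is actually fp but predicted nfp.
    type2 = sum(1 for a, p in zip(actual, predicted) if a == "fp" and p == "nfp")

    # type1 / n equals Pr(fp | nfp) * pi_nfp; type2 / n equals Pr(nfp | fp) * pi_fp.
    return (cost_type1 * type1 + cost_type2 * type2) / n


# Toy data (hypothetical): 8 nfp modules and 2 fp modules,
# with one Type I error and one Type II error.
actual = ["nfp"] * 8 + ["fp"] * 2
predicted = ["nfp"] * 7 + ["fp"] + ["fp", "nfp"]

# Sweep over illustrative cost ratios C_II / C_I with C_I fixed at 1.
ecm_by_ratio = {
    ratio: expected_cost_of_misclassification(actual, predicted, 1, ratio)
    for ratio in (1, 10, 20)
}
for ratio, ecm in ecm_by_ratio.items():
    print(f"cost ratio {ratio}: ECM = {ecm:.2f}")
```

Sweeping the cost ratio in this way shows why a model comparison can change with the ratio: a model with few Type II errors becomes relatively more attractive as C_II grows.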
Pages: 229-257
Page count: 28