Comparative Assessment of Software Quality Classification Techniques: An Empirical Case Study

Cited by: 0
Authors
Taghi M. Khoshgoftaar
Naeem Seliya
Affiliation
[1] Florida Atlantic University, Empirical Software Engineering Laboratory, Department of Computer Science and Engineering
Keywords
Software quality classification; decision trees; case-based reasoning; logistic regression; expected cost of misclassification; analysis of variance
DOI
Not available
Abstract
Software metrics-based quality classification models predict whether a software module is fault-prone (fp) or not fault-prone (nfp). Timely application of such models can help direct quality improvement efforts to modules that are likely to be fp during operations, thereby making cost-effective use of software quality testing and enhancement resources. Since several classification techniques are available, a comparative study of some commonly used techniques can be useful to practitioners. We present a comprehensive evaluation of the relative performances of seven classification techniques and/or tools: logistic regression, case-based reasoning, classification and regression trees (CART), tree-based classification with S-PLUS, and the Sprint-Sliq, C4.5, and Treedisc algorithms. The expected cost of misclassification (ECM) is introduced as a single unified measure for comparing the performances of different software quality classification models. A function of the costs of the Type I (an nfp module misclassified as fp) and Type II (an fp module misclassified as nfp) misclassifications, ECM is computed for different cost ratios. Evaluating software quality classification models under varying cost ratios is important because the usefulness of a model depends on the system-specific costs of misclassification. Moreover, models should be compared, and a preferred model selected, for cost ratios that fall within the range of interest for the given system and project domain. Software metrics were collected from four successive releases of a large legacy telecommunications system. A two-way ANOVA randomized complete block design is used, in which the system release is treated as a block and the modeling method is treated as a factor.
The predictive performance of the models is observed to differ significantly across the system releases, implying that in the software engineering domain prediction models are influenced by the characteristics of the data and the system being modeled. Multiple pairwise comparisons are performed to evaluate the relative performances of the seven models for the cost ratios of interest to the case study. In addition, the performance of the seven classification techniques is compared with that of a classification based on lines of code. The comparative approach presented in this paper can also be applied to other software systems.
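The abstract describes ECM only as a function of the Type I and Type II misclassification costs and does not give the formula. The sketch below assumes a commonly used form, ECM = C_I * Pr(fp|nfp) * pi_nfp + C_II * Pr(nfp|fp) * pi_fp, divided by C_I so that it depends only on the cost ratio C_II / C_I. The function name, the module counts, and the cost ratios are hypothetical and are not taken from the paper.

```python
def normalized_ecm(n_nfp, n_fp, type1_errors, type2_errors, cost_ratio):
    """Normalized expected cost of misclassification, in units of C_I.

    Assumed form (not quoted from the paper):
        ECM / C_I = Pr(fp|nfp) * pi_nfp + (C_II / C_I) * Pr(nfp|fp) * pi_fp

    n_nfp, n_fp   -- actual numbers of nfp and fp modules
    type1_errors  -- nfp modules misclassified as fp
    type2_errors  -- fp modules misclassified as nfp
    cost_ratio    -- C_II / C_I
    """
    if n_nfp <= 0 or n_fp <= 0:
        raise ValueError("both classes must contain at least one module")
    total = n_nfp + n_fp
    pi_nfp = n_nfp / total               # prior proportion of nfp modules
    pi_fp = n_fp / total                 # prior proportion of fp modules
    type1_rate = type1_errors / n_nfp    # Pr(fp | nfp)
    type2_rate = type2_errors / n_fp     # Pr(nfp | fp)
    return type1_rate * pi_nfp + cost_ratio * type2_rate * pi_fp


if __name__ == "__main__":
    # Hypothetical confusion counts for two models on the same release.
    models = {
        "model_a": {"type1_errors": 180, "type2_errors": 20},
        "model_b": {"type1_errors": 90, "type2_errors": 40},
    }
    for ratio in (1, 10, 50):  # hypothetical cost ratios C_II / C_I
        for name, errs in models.items():
            ecm = normalized_ecm(900, 100, cost_ratio=ratio, **errs)
            print(f"ratio={ratio:>3}  {name}: ECM={ecm:.3f}")
```

With these made-up counts, the model with fewer Type I errors has the lower cost at a low cost ratio, while the model with fewer Type II errors has the lower cost at a high ratio, which is the reason the abstract gives for comparing models only within the range of cost ratios that matters for a given system.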
Pages: 229-257
Page count: 28