An Empirical Study on the Stability of Explainable Software Defect Prediction

Cited by: 1
Authors
Shin, Jiho [1 ]
Aleithan, Reem [1 ]
Nam, Jaechang [2 ]
Wang, Junjie [3 ]
Harzevili, Nima Shiri [1 ]
Wang, Song [1 ]
Affiliations
[1] York Univ, Toronto, ON, Canada
[2] Handong Global Univ, Pohang, South Korea
[3] Chinese Acad Sci, Inst Software, Beijing, Peoples R China
Source
PROCEEDINGS OF THE 2023 30TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE, APSEC 2023 | 2023
Keywords
Software bugs; static detection; machine learning libraries; faults; models
DOI
10.1109/APSEC60848.2023.00024
Chinese Library Classification (CLC)
TP31 [Computer Software]
Subject Classification Codes
081202; 0835
Abstract
Explaining the results of software defect prediction (SDP) models is useful in practice but challenging. Jiarpakdee et al. proposed using two model-agnostic techniques (i.e., LIME and BreakDown) to explain prediction results. They showed that model-agnostic techniques can achieve remarkable performance and that the generated explanations can assist developers in understanding the prediction results. However, they examined these techniques only under one specific SDP setting, which calls their reliability under other settings into question. In this paper, we investigate the reliability and stability of model-agnostic explanation generation approaches on SDP models under different settings, e.g., different data sampling techniques, machine learning classifiers, and prediction scenarios used when building SDP models. We use the model-agnostic techniques to generate explanations for the same test instance under SDP models built with different settings and then check whether the generated explanations remain stable. Our experiments reuse the defect data and experiment configurations of Jiarpakdee et al. The results show that the examined model-agnostic techniques generate inconsistent explanations for the same test instances under different SDP settings. Our user study further confirms that inconsistent explanations significantly affect developers' understanding of the prediction results, which implies that model-agnostic techniques can be unreliable for generating explanations in practice. We therefore urge a revisit of existing model-agnostic-based studies in software engineering and call for more research on explainable SDP toward stable explanation generation.
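The abstract describes a concrete procedure: train SDP models under different settings, explain the same test instance with a model-agnostic technique, and compare the resulting explanations. The following is a minimal, hypothetical Python sketch of that idea, not the authors' code; it assumes scikit-learn and the lime package, and the metric names, synthetic data, and top-k feature-overlap measure are illustrative assumptions.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from lime.lime_tabular import LimeTabularExplainer

# Synthetic stand-in for a defect dataset (hypothetical metric names).
rng = np.random.default_rng(0)
feature_names = ["loc", "cyclomatic", "churn", "num_authors"]
X = rng.random((200, len(feature_names)))
y = (X[:, 0] + X[:, 2] > 1.0).astype(int)  # synthetic "defective" labels

# Two of the "different settings" from the abstract: different classifiers.
classifiers = {
    "rf": RandomForestClassifier(n_estimators=100, random_state=0),
    "lr": LogisticRegression(max_iter=1000),
}

explainer = LimeTabularExplainer(
    X, feature_names=feature_names, class_names=["clean", "defective"],
    mode="classification", random_state=0,
)

instance = X[0]  # the SAME test instance under every setting
top_k = 3
top_features = {}
for name, clf in classifiers.items():
    clf.fit(X, y)
    exp = explainer.explain_instance(instance, clf.predict_proba, num_features=top_k)
    # as_map() yields (feature_index, weight) pairs for the positive label.
    top_features[name] = [feature_names[i] for i, _w in exp.as_map()[1]]

# Stability proxy: overlap of the top-k explanation features across settings.
overlap = set(top_features["rf"]) & set(top_features["lr"])
print(f"top-{top_k} overlap: {len(overlap)}/{top_k} -> {sorted(overlap)}")

In this sketch, a low top-k overlap across classifiers would mirror the paper's finding: for the same instance, different SDP settings can yield diverging explanations.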
Pages: 141-150
Page count: 10