Imbalanced metric learning for crashing fault residence prediction

Cited by: 9
Authors
Xu, Zhou [1, 2, 3]
Zhao, Kunsong [4]
Yan, Meng [1, 2, 5]
Yuan, Peipei [6]
Xu, Ling [1, 2]
Lei, Yan [1, 2]
Zhang, Xiaohong [1, 2]
Affiliations
[1] Chongqing Univ, Minist Educ, Key Lab Dependable Serv Comp Cyber Phys Soc, Chongqing, Peoples R China
[2] Chongqing Univ, Sch Big Data & Software Engn, Chongqing, Peoples R China
[3] Chongqing Univ, Coll Comp Sci, Chongqing, Peoples R China
[4] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China
[5] PengCheng Lab, Shenzhen, Guangdong, Peoples R China
[6] Huazhong Univ Sci & Technol, Sch Elect Informat & Commun, Wuhan, Peoples R China
Funding
China Postdoctoral Science Foundation
Keywords
Crashing fault residence prediction; Stack trace; Metric learning; Class imbalanced learning; STACK-TRACE; LOCALIZATION
DOI
10.1016/j.jss.2020.110763
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Because software crashes usually cause serious harm, locating the fault that causes a crash (i.e., the crashing fault) has long been an active research topic. The stack trace in a crash report usually contains abundant information related to the crash, which helps in finding the crash's root cause. Recently, researchers have extracted features from crashes and built classification models on those features to predict whether the crashing fault resides in the stack trace. Such prediction can accelerate debugging and save debugging effort. In this work, we apply a state-of-the-art metric learning method called IML to crash data for crashing fault residence prediction. This method uses Mahalanobis-distance-based metric learning to learn a high-quality feature representation by reducing the distance between crash instances with the same label and increasing the distance between crash instances with different labels. In addition, the method uses a new loss function that combines four types of losses with different weights to cope with the class imbalance in crash data. Experiments on seven open source software projects show that our IML method performs significantly better than nine sampling-based and five ensemble-based imbalanced learning methods in terms of three performance indicators. © 2020 Elsevier Inc. All rights reserved.
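The Mahalanobis-distance idea in the abstract can be illustrated with a minimal sketch. This is not the paper's IML implementation: the function names, the two-term weighted loss (the abstract mentions four loss types whose exact form is not given here), the `margin`, and the weights `w_sim` and `w_dis` are all assumptions made for illustration. The metric is parameterized as M = LᵀL so that it stays positive semidefinite.

```python
import numpy as np


def mahalanobis_sq(x, y, L):
    """Squared Mahalanobis distance (x - y)^T M (x - y) with M = L^T L."""
    diff = L @ (np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    return float(diff @ diff)


def weighted_pair_loss(X, labels, L, margin=1.0, w_sim=1.0, w_dis=1.0):
    """Hypothetical two-term pairwise loss (a simplification, not IML's loss).

    Same-label pairs are penalized by their squared distance (pull together);
    different-label pairs are penalized only when closer than `margin`
    (push apart). The weights let one term count more than the other,
    which is one simple way to account for class imbalance.
    """
    loss = 0.0
    n = len(X)
    for i in range(n):
        for j in range(i + 1, n):
            d = mahalanobis_sq(X[i], X[j], L)
            if labels[i] == labels[j]:
                loss += w_sim * d
            else:
                loss += w_dis * max(0.0, margin - d)
    return loss


if __name__ == "__main__":
    # Toy crash-feature vectors: two instances of class 0, one of class 1.
    X = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 4.0]])
    labels = [0, 0, 1]
    L = np.eye(2)  # identity L reduces to squared Euclidean distance
    print(mahalanobis_sq(X[0], X[2], L))
    print(weighted_pair_loss(X, labels, L))
```

In an actual metric learning setup, `L` would be optimized (for example by gradient descent) to minimize such a loss, and a classifier would then be trained on the transformed features `X @ L.T`.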
Pages: 12