Towards automatic labeling of exception handling bugs: A case study of 10 years bug-fixing in Apache Hadoop

Cited by: 0

Authors:

da Silva, Antonio Jose A. [1]

Vieira, Renan G. [1]

Mesquita, Diego P. P. [2]

Gomes, Joao Paulo P. [1]

Rocha, Lincoln S. [1]

Affiliations:

[1] Univ Fed Ceara, Ave Humberto Monte,S-N Pici, BR-60440593 Fortaleza, Ceara, Brazil

[2] Getulio Vargas Fdn, 190 Botafogo, BR-22250900 Rio De Janeiro, RJ, Brazil

Source:

EMPIRICAL SOFTWARE ENGINEERING | 2024 / Volume 29 / Issue 04

Keywords:

Exception handling bug; Automatic bug labeling; Machine learning; Natural language processing; SOFTWARE; ISSUES; JAVA

DOI:

10.1007/s10664-024-10494-0

Chinese Library Classification (CLC) number:

TP31 [Computer software]

Discipline classification code:

081202; 0835

Abstract:

Context: Exception handling (EH) bugs stem from incorrect usage of exception handling mechanisms (EHMs) and often incur severe consequences (e.g., system downtime, data loss, and security risk). Tracking EH bugs is particularly relevant for contemporary systems (e.g., cloud- and AI-based systems), in which the software's sophisticated logic is an additional threat to the correct use of the EHM. On top of that, bug reporters can seldom tag EH bugs, since doing so may require an encompassing knowledge of the software's EH strategy. Surprisingly, to the best of our knowledge, there is no automated procedure to identify EH bugs from report descriptions.

Objective: First, we aim to evaluate the extent to which Natural Language Processing (NLP) and Machine Learning (ML) can be used to reliably label EH bugs using the text fields from bug reports (e.g., summary, description, and comments). Second, we aim to provide a reliably labeled dataset that the community can use in future endeavors. Overall, we expect our work to raise the community's awareness regarding the importance of EH bugs.

Method: We manually analyzed 4,516 bug reports from the four main components of Apache's Hadoop project, out of which we labeled approximately 20% (943) as EH bugs. We also labeled 2,584 non-EH bugs by analyzing their bug-fixing code, creating a dataset composed of 7,100 bug reports. Then, we used word embedding techniques (Bag-of-Words and TF-IDF) to summarize the textual fields of bug reports. Subsequently, we used these embeddings to fit five classes of ML methods and evaluated them on unseen data. We also evaluated a pre-trained transformer-based model using the complete textual fields. Finally, we evaluated whether considering only EH keywords is enough to achieve high predictive performance.

Results: Our results show that a pre-trained DistilBERT with a linear layer trained on our proposed dataset can reasonably label EH bugs, achieving ROC-AUC scores of up to 0.88. The combination of traditional NLP and ML techniques achieved ROC-AUC scores of up to 0.74 and recall of up to 0.56. As a sanity check, we also evaluated methods using embeddings extracted solely from keywords. Considering ROC-AUC as the primary concern, for the majority of ML methods tested, the analysis suggests that keywords alone are not sufficient to characterize reports of EH bugs, although this can change based on other metrics (such as recall and precision) or ML methods (e.g., Random Forest).

Conclusions: To the best of our knowledge, this is the first study addressing the problem of automatic labeling of EH bugs. Based on our results, we can conclude that the use of ML techniques, especially transformer-based models, appears promising for automating the task of labeling EH bugs. Overall, we hope (i) that our work will contribute towards raising awareness around EH bugs; and (ii) that our (publicly available) dataset will serve as a benchmarking dataset, paving the way for follow-up works. Additionally, our findings can be used to build tools that help maintainers flesh out EH bugs during the triage process.
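Illustrative sketch (not from the paper): the abstract reports a keyword-only sanity check scored with ROC-AUC. The Python below shows what such a baseline could look like on toy data. The keyword stems in EH_KEYWORDS, the four sample reports, and the helper names keyword_score and roc_auc are all assumptions made for illustration; the authors' actual keyword set, features, and models are not given in this record.

```python
import re

# Hypothetical keyword stems; the paper's real EH keyword list is not given here.
EH_KEYWORDS = ("exception", "throw", "catch", "finally")


def keyword_score(text):
    """Fraction of tokens in a report that contain an EH keyword stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(any(k in t for k in EH_KEYWORDS) for t in tokens)
    return hits / len(tokens)


def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive example is scored
    above a random negative one (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy bug reports (invented for this sketch): label 1 = EH bug, 0 = non-EH bug.
reports = [
    ("NullPointerException thrown when the block report is empty", 1),
    ("Catch block swallows IOException during checkpoint", 1),
    ("Web UI shows wrong disk usage", 0),
    ("Typo in configuration documentation", 0),
]
labels = [y for _, y in reports]
scores = [keyword_score(t) for t, _ in reports]
auc = roc_auc(labels, scores)
print(f"keyword-only ROC-AUC on toy data: {auc:.2f}")
```

On real reports, a keyword score like this would be one baseline among several; the abstract suggests that, for most of the ML methods tested, keywords alone were not sufficient when judged by ROC-AUC.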

Pages: 30

References:

[1] Vieira, Renan; da Silva, Antonio; Rocha, Lincoln; Gomes, Joao Paulo. From Reports to Bug-Fix Commits: A 10 Years Dataset of Bug-Fixing Activity from 55 Apache's Open Source Projects. 15TH INTERNATIONAL CONFERENCE ON PREDICTIVE MODELS AND DATA ANALYTICS IN SOFTWARE ENGINEERING (PROMISE'19), 2019: 80-89
