A Comparative Study of Transformer-based Neural Text Representation Techniques on Bug Triaging

Cited: 0
Authors
Dipongkor, Atish Kumar [1 ]
Moran, Kevin [1 ]
Affiliations
[1] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA
Keywords
Bug Triaging; Transformer; LLMs; Text-Embedding; DL4SE; ACCURATE
DOI
10.1109/ASE56229.2023.00217
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Bug report management has been shown to be an important and time-consuming software maintenance task. Often, the first step in managing bug reports is triaging a bug to the appropriate developer who is best suited to understand, localize, and fix the target bug. Additionally, assigning a given bug to a particular part of a software project can help to expedite the fixing process. However, despite the importance of these activities, they are quite challenging, and manual triaging can take days. Past studies have attempted to leverage the limited textual data of bug reports to train text classification models that automate this process, with varying degrees of success. However, the textual representations and machine learning models used in prior work are limited in their expressiveness, often failing to capture nuanced textual patterns that might otherwise aid in the triaging process. Recently, large, transformer-based, pre-trained neural text representation techniques (i.e., large language models or LLMs) such as BERT and CodeBERT have achieved greater performance with simplified training procedures in several natural language processing tasks, including text classification. However, the potential for using these techniques to improve upon prior approaches for automated bug triaging is not well studied or understood. Therefore, in this paper we offer one of the first investigations that fine-tunes transformer-based language models for the task of bug triaging on four open-source datasets, spanning a collective 53 years of development history with over 400 developers and over 150 software project components. Our study includes both a quantitative and a qualitative analysis of effectiveness. Our findings illustrate that DeBERTa is the most effective technique across the triaging tasks of developer and component assignment, and that the measured performance delta is statistically significant compared to other techniques. However, through our qualitative analysis, we also observe that each technique possesses unique abilities best suited to certain types of bug reports.
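To make the approach the abstract describes concrete, the sketch below shows how a pre-trained transformer such as DeBERTa can be fine-tuned as a bug-report classifier (developer assignment framed as multi-class text classification) using the Hugging Face transformers library. This is a minimal illustrative sketch, not the paper's implementation: the checkpoint name, toy reports, label set, and hyperparameters are assumptions, and the paper's datasets, preprocessing, and training configuration are not reproduced here.

```python
# Minimal sketch: fine-tune a DeBERTa checkpoint for bug triaging framed as
# multi-class text classification (developer assignment). The checkpoint,
# toy data, and hyperparameters are illustrative assumptions, not the
# paper's exact experimental setup.
import torch
from torch.optim import AdamW
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder bug reports (title/description text) and hypothetical assignees.
reports = [
    "Crash on startup when config file is missing",
    "Button label overlaps icon on high-DPI displays",
    "Memory leak in background sync worker",
]
developers = ["alice", "bob", "alice"]
label2id = {d: i for i, d in enumerate(sorted(set(developers)))}
labels = torch.tensor([label2id[d] for d in developers])

model_name = "microsoft/deberta-base"  # assumed checkpoint; the paper may differ
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=len(label2id)
)

# Tokenize report text; long reports are truncated to the model's input limit.
batch = tokenizer(reports, truncation=True, padding=True, return_tensors="pt")

# Standard fine-tuning loop (a real run would use batched DataLoaders,
# a held-out evaluation split, and more epochs).
optimizer = AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):
    optimizer.zero_grad()
    out = model(**batch, labels=labels)
    out.loss.backward()
    optimizer.step()

# Inference: route a new bug report to the highest-scoring developer.
model.eval()
with torch.no_grad():
    new = tokenizer(["Sync worker leaks file handles"],
                    truncation=True, padding=True, return_tensors="pt")
    pred = model(**new).logits.argmax(dim=-1).item()
print({i: d for d, i in label2id.items()}[pred])
```

In the paper's setting such a classifier would be trained per project over hundreds of developer or component labels; the same scaffolding applies to the other studied models by swapping the checkpoint (e.g., microsoft/codebert-base for CodeBERT) and the label set.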
Pages: 1012 - 1023
Number of pages: 12
Related Papers
50 records in total
  • [1] Transformer-based Bug/Feature Classification
    Ozturk, Ceyhun E.
    Yilmaz, Eyup Halit
    Koksal, Omer
    [J]. 2023 31ST SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU, 2023,
  • [2] TSLocator: A Transformer-Based Approach to Bug Localization
    HU Cheng
    XIAO Yuliang
    [J]. Wuhan University Journal of Natural Sciences, 2021, 26 (02) : 200 - 206
  • [3] A Comparative Survey of Instance Selection Methods applied to Non-Neural and Transformer-Based Text Classification
    Cunha, Washington
    Viegas, Felipe
    Franca, Celso
    Rosa, Thierson
    Rocha, Leonardo
    Goncalves, Marcos Andre
    [J]. ACM COMPUTING SURVEYS, 2023, 55 (13S)
  • [4] TransHuman: A Transformer-based Human Representation for Generalizable Neural Human Rendering
    Pan, Xiao
    Yang, Zongxin
    Ma, Jianxin
    Zhou, Chang
    Yang, Yi
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 3521 - 3532
  • [5] Transformer-based Text Detection in the Wild
    Raisi, Zobeir
    Naiel, Mohamed A.
    Younes, Georges
    Wardell, Steven
    Zelek, John S.
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 3156 - 3165
  • [6] Arabic Fake News Detection: Comparative Study of Neural Networks and Transformer-Based Approaches
    Al-Yahya, Maha
    Al-Khalifa, Hend
    Al-Baity, Heyam
    AlSaeed, Duaa
    Essam, Amr
    [J]. COMPLEXITY, 2021, 2021
  • [7] Neural Rule-Execution Tracking Machine For Transformer-Based Text Generation
    Wang, Yufei
    Xu, Can
    Hu, Huang
    Tao, Chongyang
    Wan, Stephen
    Dras, Mark
    Johnson, Mark
    Jiang, Daxin
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [8] Analyzing Amazon Products Sentiment: A Comparative Study of Machine and Deep Learning, and Transformer-Based Techniques
    Ali, Hashir
    Hashmi, Ehtesham
    Yildirim, Sule Yayilgan
    Shaikh, Sarang
    [J]. ELECTRONICS, 2024, 13 (07)
  • [9] TIRec: Transformer-based Invoice Text Recognition
    Chen, Yanlan
    [J]. 2023 2ND ASIA CONFERENCE ON ALGORITHMS, COMPUTING AND MACHINE LEARNING, CACML 2023, 2023, : 175 - 180
  • [10] Practical Transformer-based Multilingual Text Classification
    Wang, Cindy
    Banko, Michele
    [J]. 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, NAACL-HLT 2021, 2021, : 121 - 129