A Comparative Study of Transformer-based Neural Text Representation Techniques on Bug Triaging

Times Cited: 0
Authors
Dipongkor, Atish Kumar [1 ]
Moran, Kevin [1 ]
Affiliations
[1] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA
Keywords
Bug Triaging; Transformer; LLMs; Text-Embedding; DL4SE; ACCURATE
DOI
10.1109/ASE56229.2023.00217
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Bug report management has been shown to be an important and time-consuming software maintenance task. Often, the first step in managing bug reports is triaging a bug to the appropriate developer who is best suited to understand, localize, and fix the target bug. Additionally, assigning a given bug to a particular part of a software project can help to expedite the fixing process. However, despite the importance of these activities, they are quite challenging, and days can be spent on the manual triaging process. Past studies have attempted to leverage the limited textual data of bug reports to train text classification models that automate this process, to varying degrees of success. However, the textual representations and machine learning models used in prior work are limited by their expressiveness, often failing to capture nuanced textual patterns that might otherwise aid in the triaging process. Recently, large, transformer-based, pre-trained neural text representation techniques (i.e., large language models or LLMs) such as BERT and CodeBERT have achieved greater performance with simplified training procedures in several natural language processing tasks, including text classification. However, the potential for using these techniques to improve upon prior approaches for automated bug triaging is not well studied or understood. Therefore, in this paper we offer one of the first investigations that fine-tunes transformer-based language models for the task of bug triaging on four open-source datasets, spanning a collective 53 years of development history with over 400 developers and over 150 software project components. Our study includes both a quantitative and qualitative analysis of effectiveness. Our findings illustrate that DeBERTa is the most effective technique across the triaging tasks of developer and component assignment, and the measured performance delta is statistically significant compared to other techniques. However, through our qualitative analysis, we also observe that each technique possesses unique abilities best suited to certain types of bug reports.
Pages: 1012-1023
Page Count: 12
Related Papers (50 total)
  • [21] Li, Jiajun; Song, Huazhu; Li, Jun. Transformer-based Question Text Generation in the Learning System. In: 6th International Conference on Innovation in Artificial Intelligence (ICIAI 2022), 2022, pp. 50-56.
  • [22] Siddiq, Mohammed Latif; Majumder, Shafayat H.; Mim, Maisha R.; Jajodia, Sourov; Santos, Joanna C. S. An Empirical Study of Code Smells in Transformer-based Code Generation Techniques. In: 2022 IEEE 22nd International Working Conference on Source Code Analysis and Manipulation (SCAM 2022), 2022, pp. 71-82.
  • [23] Wang, Shang; Tang, Huanrong; Ouyang, Jianquan. A Transformer-based Neural Architecture Search Method. In: Proceedings of the 2023 Genetic and Evolutionary Computation Conference Companion (GECCO 2023 Companion), 2023, pp. 691-694.
  • [25] Khoshsirat, Seyedalireza; Kambhamettu, Chandra. A transformer-based neural ODE for dense prediction. Machine Vision and Applications, 2023, 34(06).
  • [26] Atiea, Mohammed A.; Adel, Mark. Transformer-based Neural Network for Electrocardiogram Classification. International Journal of Advanced Computer Science and Applications, 2022, 13(11): 357-363.
  • [27] Lang, Jiaqi; Li, Linjing; Chen, Weiyun; Zeng, Daniel. Privacy Protection in Transformer-based Neural Network. In: 2019 IEEE International Conference on Intelligence and Security Informatics (ISI), 2019, pp. 182-184.
  • [28] Donoso-Oliva, C.; Becker, I.; Protopapas, P.; Cabrera-Vives, G.; Vishnu, M.; Vardhan, H. ASTROMER: A transformer-based embedding for the representation of light curves. Astronomy & Astrophysics, 2023, 670.
  • [29] Li, Longhai; Duan, Lei; Wang, Junchen; Xie, Guicai; He, Chengxin; Chen, Zihao; Deng, Song. Transformer-Based Representation Learning on Temporal Heterogeneous Graphs. In: Web and Big Data, Part II (APWeb-WAIM 2022), 2023, 13422: 385-400.
  • [30] Krishnaswamy, Harish; Hashemi, Hossein. Inductor- and transformer-based integrated RF oscillators: A comparative study. In: Proceedings of the IEEE 2006 Custom Integrated Circuits Conference, 2006, pp. 381-384.