A Comparative Study of Transformer-based Neural Text Representation Techniques on Bug Triaging

Times Cited: 0
Authors
Dipongkor, Atish Kumar [1 ]
Moran, Kevin [1 ]
Affiliations
[1] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA
Keywords
Bug Triaging; Transformer; LLMs; Text-Embedding; DL4SE; ACCURATE
DOI
10.1109/ASE56229.2023.00217
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Bug report management has been shown to be an important and time-consuming software maintenance task. Often, the first step in managing bug reports is triaging a bug to the appropriate developer who is best suited to understand, localize, and fix the target bug. Additionally, assigning a given bug to a particular part of a software project can help to expedite the fixing process. However, despite the importance of these activities, they are quite challenging, and days can be spent on the manual triaging process. Past studies have attempted to leverage the limited textual data of bug reports to train text classification models that automate this process, to varying degrees of success. However, the textual representations and machine learning models used in prior work are limited by their expressiveness, often failing to capture nuanced textual patterns that might otherwise aid in the triaging process. Recently, large, transformer-based, pre-trained neural text representation techniques (i.e., large language models or LLMs) such as BERT and CodeBERT have achieved greater performance with simplified training procedures in several natural language processing tasks, including text classification. However, the potential for using these techniques to improve upon prior approaches for automated bug triaging is not well studied or understood. Therefore, in this paper we offer one of the first investigations that fine-tunes transformer-based language models for the task of bug triaging on four open-source datasets, spanning a collective 53 years of development history with over 400 developers and over 150 software project components. Our study includes both a quantitative and qualitative analysis of effectiveness. Our findings illustrate that DeBERTa is the most effective technique across the triaging tasks of developer and component assignment, and the measured performance delta is statistically significant compared to other techniques. However, through our qualitative analysis, we also observe that each technique possesses unique abilities best suited to certain types of bug reports.
Pages: 1012-1023
Page Count: 12
Related Papers (50 total)
  • [21] Li, Jiajun; Song, Huazhu; Li, Jun. Transformer-based Question Text Generation in the Learning System. In: 6th International Conference on Innovation in Artificial Intelligence (ICIAI 2022), 2022, pp. 50-56.
  • [22] Siddiq, Mohammed Latif; Majumder, Shafayat H.; Mim, Maisha R.; Jajodia, Sourov; Santos, Joanna C. S. An Empirical Study of Code Smells in Transformer-based Code Generation Techniques. In: 2022 IEEE 22nd International Working Conference on Source Code Analysis and Manipulation (SCAM 2022), 2022, pp. 71-82.
  • [23] Wang, Shang; Tang, Huanrong; Ouyang, Jianquan. A Transformer-based Neural Architecture Search Method. In: Proceedings of the 2023 Genetic and Evolutionary Computation Conference Companion (GECCO 2023 Companion), 2023, pp. 691-694.
  • [25] Khoshsirat, Seyedalireza; Kambhamettu, Chandra. A transformer-based neural ODE for dense prediction. Machine Vision and Applications, 2023, 34(06).
  • [26] Atiea, Mohammed A.; Adel, Mark. Transformer-based Neural Network for Electrocardiogram Classification. International Journal of Advanced Computer Science and Applications, 2022, 13(11): 357-363.
  • [27] Lang, Jiaqi; Li, Linjing; Chen, Weiyun; Zeng, Daniel. Privacy Protection in Transformer-based Neural Network. In: 2019 IEEE International Conference on Intelligence and Security Informatics (ISI), 2019, pp. 182-184.
  • [28] Donoso-Oliva, C.; Becker, I.; Protopapas, P.; Cabrera-Vives, G.; Vishnu, M.; Vardhan, H. ASTROMER: A transformer-based embedding for the representation of light curves. Astronomy & Astrophysics, 2023, 670.
  • [29] Li, Longhai; Duan, Lei; Wang, Junchen; Xie, Guicai; He, Chengxin; Chen, Zihao; Deng, Song. Transformer-Based Representation Learning on Temporal Heterogeneous Graphs. In: Web and Big Data, Part II (APWeb-WAIM 2022), 2023, 13422: 385-400.
  • [30] Krishnaswamy, Harish; Hashemi, Hossein. Inductor- and transformer-based integrated RF oscillators: A comparative study. In: Proceedings of the IEEE 2006 Custom Integrated Circuits Conference, 2006, pp. 381-384.