A Comparative Study of Transformer-based Neural Text Representation Techniques on Bug Triaging

Cited: 0
Authors
Dipongkor, Atish Kumar [1 ]
Moran, Kevin [1 ]
Affiliations
[1] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA
Keywords
Bug Triaging; Transformer; LLMs; Text-Embedding; DL4SE; ACCURATE
DOI
10.1109/ASE56229.2023.00217
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Bug report management has been shown to be an important and time-consuming software maintenance task. Often, the first step in managing bug reports is triaging a bug to the appropriate developer who is best suited to understand, localize, and fix the target bug. Additionally, assigning a given bug to a particular part of a software project can help to expedite the fixing process. However, despite the importance of these activities, they are quite challenging, and manual triaging can take days. Past studies have attempted to leverage the limited textual data of bug reports to train text classification models that automate this process, with varying degrees of success. However, the textual representations and machine learning models used in prior work are limited in their expressiveness, often failing to capture nuanced textual patterns that might otherwise aid in the triaging process. Recently, large, transformer-based, pre-trained neural text representation techniques (i.e., large language models or LLMs) such as BERT and CodeBERT have achieved greater performance with simplified training procedures in several natural language processing tasks, including text classification. However, the potential for using these techniques to improve upon prior approaches for automated bug triaging is not well studied or understood. Therefore, in this paper we offer one of the first investigations that fine-tunes transformer-based language models for the task of bug triaging on four open-source datasets, spanning a collective 53 years of development history with over 400 developers and over 150 software project components. Our study includes both a quantitative and a qualitative analysis of effectiveness. Our findings illustrate that DeBERTa is the most effective technique across the triaging tasks of developer and component assignment, and that the measured performance delta is statistically significant compared to other techniques. However, through our qualitative analysis, we also observe that each technique possesses unique abilities best suited to certain types of bug reports.
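To make the approach the abstract describes concrete, the sketch below shows how a pre-trained transformer such as DeBERTa can be fine-tuned as a bug-report classifier (developer assignment framed as multi-class text classification) using the Hugging Face transformers library. This is a minimal illustrative sketch, not the paper's implementation: the checkpoint name, toy reports, label set, and hyperparameters are assumptions, and the paper's datasets, preprocessing, and training configuration are not reproduced here.

```python
# Minimal sketch: fine-tune a DeBERTa checkpoint for bug triaging framed as
# multi-class text classification (developer assignment). The checkpoint,
# toy data, and hyperparameters are illustrative assumptions, not the
# paper's exact experimental setup.
import torch
from torch.optim import AdamW
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder bug reports (title/description text) and hypothetical assignees.
reports = [
    "Crash on startup when config file is missing",
    "Button label overlaps icon on high-DPI displays",
    "Memory leak in background sync worker",
]
developers = ["alice", "bob", "alice"]
label2id = {d: i for i, d in enumerate(sorted(set(developers)))}
labels = torch.tensor([label2id[d] for d in developers])

model_name = "microsoft/deberta-base"  # assumed checkpoint; the paper may differ
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=len(label2id)
)

# Tokenize report text; long reports are truncated to the model's input limit.
batch = tokenizer(reports, truncation=True, padding=True, return_tensors="pt")

# Standard fine-tuning loop (a real run would use batched DataLoaders,
# a held-out evaluation split, and more epochs).
optimizer = AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):
    optimizer.zero_grad()
    out = model(**batch, labels=labels)
    out.loss.backward()
    optimizer.step()

# Inference: route a new bug report to the highest-scoring developer.
model.eval()
with torch.no_grad():
    new = tokenizer(["Sync worker leaks file handles"],
                    truncation=True, padding=True, return_tensors="pt")
    pred = model(**new).logits.argmax(dim=-1).item()
print({i: d for d, i in label2id.items()}[pred])
```

In the paper's setting such a classifier would be trained per project over hundreds of developer or component labels; the same scaffolding applies to the other studied models by swapping the checkpoint (e.g., microsoft/codebert-base for CodeBERT) and the label set.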
Pages: 1012 - 1023
Number of pages: 12
Related Papers
50 records in total
  • [1] Transformer-based Bug/Feature Classification
    Ozturk, Ceyhun E.
    Yilmaz, Eyup Halit
    Koksal, Omer
    [J]. 2023 31ST SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU, 2023,
  • [2] TSLocator: A Transformer-Based Approach to Bug Localization
    HU Cheng
    XIAO Yuliang
    [J]. Wuhan University Journal of Natural Sciences, 2021, 26 (02) : 200 - 206
  • [3] A Comparative Survey of Instance Selection Methods applied to Non-Neural and Transformer-Based Text Classification
    Cunha, Washington
    Viegas, Felipe
    Franca, Celso
    Rosa, Thierson
    Rocha, Leonardo
    Goncalves, Marcos Andre
    [J]. ACM COMPUTING SURVEYS, 2023, 55 (13S)
  • [4] TransHuman: A Transformer-based Human Representation for Generalizable Neural Human Rendering
    Pan, Xiao
    Yang, Zongxin
    Ma, Jianxin
    Zhou, Chang
    Yang, Yi
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 3521 - 3532
  • [5] Transformer-based Text Detection in the Wild
    Raisi, Zobeir
    Naiel, Mohamed A.
    Younes, Georges
    Wardell, Steven
    Zelek, John S.
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 3156 - 3165
  • [6] Arabic Fake News Detection: Comparative Study of Neural Networks and Transformer-Based Approaches
    Al-Yahya, Maha
    Al-Khalifa, Hend
    Al-Baity, Heyam
    AlSaeed, Duaa
    Essam, Amr
    [J]. COMPLEXITY, 2021, 2021
  • [7] Neural Rule-Execution Tracking Machine For Transformer-Based Text Generation
    Wang, Yufei
    Xu, Can
    Hu, Huang
    Tao, Chongyang
    Wan, Stephen
    Dras, Mark
    Johnson, Mark
    Jiang, Daxin
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [8] Analyzing Amazon Products Sentiment: A Comparative Study of Machine and Deep Learning, and Transformer-Based Techniques
    Ali, Hashir
    Hashmi, Ehtesham
    Yildirim, Sule Yayilgan
    Shaikh, Sarang
    [J]. ELECTRONICS, 2024, 13 (07)
  • [9] TIRec: Transformer-based Invoice Text Recognition
    Chen, Yanlan
    [J]. 2023 2ND ASIA CONFERENCE ON ALGORITHMS, COMPUTING AND MACHINE LEARNING, CACML 2023, 2023, : 175 - 180
  • [10] Practical Transformer-based Multilingual Text Classification
    Wang, Cindy
    Banko, Michele
    [J]. 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, NAACL-HLT 2021, 2021, : 121 - 129