Identifying self-admitted technical debt in issue tracking systems using machine learning

Cited by: 11
Authors
Li, Yikun [1]
Soliman, Mohamed [1]
Avgeriou, Paris [1]
Affiliations
[1] Univ Groningen, Bernoulli Inst Math Comp Sci & Artificial Intelli, Groningen, Netherlands
Keywords
Self-admitted technical debt; Technical debt identification; Issue tracking system; Deep learning; Transfer learning; IDENTIFICATION;
DOI
10.1007/s10664-022-10128-3
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Technical debt is a metaphor for sub-optimal solutions that are implemented for short-term benefit at the expense of the long-term maintainability and evolvability of software. A special type of technical debt is explicitly admitted by software engineers (e.g., using a TODO comment); this is called Self-Admitted Technical Debt or SATD. Most work on automatically identifying SATD focuses on source code comments. In addition to source code comments, issue tracking systems have been shown to be another rich source of SATD, but there are no approaches specifically for automatically identifying SATD in issues. In this paper, we first create a training dataset by collecting and manually analyzing 4,200 issues (which break down into 23,180 issue sections) from seven open-source projects (i.e., Camel, Chromium, Gerrit, Hadoop, HBase, Impala, and Thrift) using two popular issue tracking systems (i.e., Jira and Google Monorail). We then propose and optimize an approach for automatically identifying SATD in issue tracking systems using machine learning. Our findings indicate that: 1) our approach outperforms baseline approaches by a wide margin with regard to the F1-score; 2) transferring knowledge from suitable datasets can improve the predictive performance of our approach; 3) the extracted SATD keywords are intuitive and potentially indicate types and indicators of SATD; 4) projects using different issue tracking systems share fewer SATD keywords than projects using the same issue tracking system; 5) only a small amount of training data is needed to achieve good accuracy.
Pages: 37
Related papers
50 in total
  • [41] Using BiLSTM with attention mechanism to automatically detect self-admitted technical debt
    Yu, Dongjin
    Wang, Lin
    Chen, Xin
    Chen, Jie
    FRONTIERS OF COMPUTER SCIENCE, 2021, 15 (04)
  • [42] Self-Admitted Technical Debt and comments' polarity: an empirical study
    Cassee, Nathan
    Zampetti, Fiorella
    Novielli, Nicole
    Serebrenik, Alexander
    Di Penta, Massimiliano
    EMPIRICAL SOFTWARE ENGINEERING, 2022, 27 (06)
  • [43] Self-Admitted Technical Debt and comments’ polarity: an empirical study
    Nathan Cassee
    Fiorella Zampetti
    Nicole Novielli
    Alexander Serebrenik
    Massimiliano Di Penta
    Empirical Software Engineering, 2022, 27
  • [44] Using Natural Language Processing to Automatically Detect Self-Admitted Technical Debt
    Maldonado, Everton da Silva
    Shihab, Emad
    Tsantalis, Nikolaos
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2017, 43 (11) : 1044 - 1062
  • [45] On the value of a prioritization scheme for resolving Self-admitted technical debt
    Mensah, Solomon
    Keung, Jacky
    Svajlenko, Jeffery
    Bennin, Kwabena Ebo
    Mi, Qing
    JOURNAL OF SYSTEMS AND SOFTWARE, 2018, 135 : 37 - 54
  • [46] Recommending when Design Technical Debt Should be Self-Admitted
    Zampetti, Fiorella
    Noiseux, Cedric
    Antoniol, Giuliano
    Khomh, Foutse
    Di Penta, Massimiliano
    2017 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION (ICSME), 2017, : 216 - 226
  • [47] Multiclass Classification for Self-Admitted Technical Debt Based on XGBoost
    Chen, Xin
    Yu, Dongjin
    Fan, Xulin
    Wang, Lin
    Chen, Jie
    IEEE TRANSACTIONS ON RELIABILITY, 2022, 71 (03) : 1309 - 1324
  • [48] Deep Learning-Based Self-Admitted Technical Debt Detection Empirical Research
    Qu, Yubin
    Bao, Tie
    Yuan, Meng
    Li, Long
    JOURNAL OF INTERNET TECHNOLOGY, 2023, 24 (04) : 975 - 987
  • [49] Detecting and Quantifying Different Types of Self-Admitted Technical Debt
    Maldonado, Everton da S.
    Shihab, Emad
    2015 IEEE 7TH INTERNATIONAL WORKSHOP ON MANAGING TECHNICAL DEBT (MTD) PROCEEDINGS, 2015, : 9 - 15
  • [50] Is Self-Admitted Technical Debt a Good Indicator of Architectural Divergences?
    Sierra, Giancarlo
    Tahmid, Ahmad
    Shihab, Emad
    Tsantalis, Nikolaos
    2019 IEEE 26TH INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, EVOLUTION AND REENGINEERING (SANER), 2019, : 534 - 543