Identifying self-admitted technical debt in issue tracking systems using machine learning

Cited by: 11
Authors
Li, Yikun [1]
Soliman, Mohamed [1]
Avgeriou, Paris [1]
Affiliations
[1] Univ Groningen, Bernoulli Inst Math Comp Sci & Artificial Intelli, Groningen, Netherlands
Keywords
Self-admitted technical debt; Technical debt identification; Issue tracking system; Deep learning; Transfer learning; IDENTIFICATION;
DOI
10.1007/s10664-022-10128-3
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
Technical debt is a metaphor for sub-optimal solutions that are implemented for short-term benefit at the expense of the long-term maintainability and evolvability of software. A special type of technical debt is explicitly admitted by software engineers (e.g., using a TODO comment); this is called Self-Admitted Technical Debt or SATD. Most work on automatically identifying SATD focuses on source code comments. Besides source code comments, issue tracking systems have been shown to be another rich source of SATD, but there are no approaches specifically for automatically identifying SATD in issues. In this paper, we first create a training dataset by collecting and manually analyzing 4,200 issues (which break down into 23,180 issue sections) from seven open-source projects (i.e., Camel, Chromium, Gerrit, Hadoop, HBase, Impala, and Thrift) that use two popular issue tracking systems (i.e., Jira and Google Monorail). We then propose and optimize an approach for automatically identifying SATD in issue tracking systems using machine learning. Our findings indicate that: 1) our approach outperforms baseline approaches by a wide margin with regard to the F1-score; 2) transferring knowledge from suitable datasets can improve the predictive performance of our approach; 3) the extracted SATD keywords are intuitive and potentially indicate types and indicators of SATD; 4) projects using different issue tracking systems share fewer SATD keywords than projects using the same issue tracking system; 5) only a small amount of training data is needed to achieve good accuracy.
Pages: 37
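The abstract compares approaches by F1-score. As a reminder of what that metric measures, the following is a minimal sketch of the F1-score for a binary "SATD / not SATD" labeling of issue sections. The function name, the label encoding (1 = SATD) and the toy labels are assumptions made for illustration; they are not taken from the paper or its dataset.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined),
        # so the score is reported as 0.0 by convention here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for five issue sections (1 = SATD, 0 = not SATD).
manual_labels = [1, 1, 1, 0, 0]
predicted_labels = [1, 1, 0, 1, 0]
score = f1_score(manual_labels, predicted_labels)
print(round(score, 3))
```

Because F1 combines precision and recall, it is less easily inflated than plain accuracy when SATD sections are a minority of all sections.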
Related papers
50 in total
  • [1] Identifying self-admitted technical debt in issue tracking systems using machine learning
    Yikun Li
    Mohamed Soliman
    Paris Avgeriou
    Empirical Software Engineering, 2022, 27
  • [2] Beyond the Code: Mining Self-Admitted Technical Debt in Issue Tracker Systems
    Xavier, Laerte
    Ferreira, Fabio
    Brito, Rodrigo
    Valente, Marco Tulio
    2020 IEEE/ACM 17TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES, MSR, 2020, : 137 - 146
  • [3] Wait for it: identifying "On-Hold" self-admitted technical debt
    Maipradit, Rungroj
    Treude, Christoph
    Hata, Hideaki
    Matsumoto, Kenichi
    EMPIRICAL SOFTWARE ENGINEERING, 2020, 25 (05) : 3770 - 3798
  • [4] Wait for it: identifying “On-Hold” self-admitted technical debt
    Rungroj Maipradit
    Christoph Treude
    Hideaki Hata
    Kenichi Matsumoto
    Empirical Software Engineering, 2020, 25 : 3770 - 3798
  • [5] SATDBailiff-mining and tracking self-admitted technical debt
    AlOmar, Eman Abdullah
    Christians, Ben
    Busho, Mihal
    AlKhalid, Ahmed Hamad
    Ouni, Ali
    Newman, Christian
    Mkaouer, Mohamed Wiem
    SCIENCE OF COMPUTER PROGRAMMING, 2022, 213
  • [6] Identification and Remediation of Self-Admitted Technical Debt in Issue Trackers
    Li, Yikun
    Soliman, Mohamed
    Avgeriou, Paris
    2020 46TH EUROMICRO CONFERENCE ON SOFTWARE ENGINEERING AND ADVANCED APPLICATIONS (SEAA 2020), 2020, : 495 - 503
  • [7] A survey of self-admitted technical debt
    Sierra, Giancarlo
    Shihab, Emad
    Kamei, Yasutaka
    JOURNAL OF SYSTEMS AND SOFTWARE, 2019, 152 : 70 - 82
  • [8] Correction to: Wait for it: identifying “On-Hold” self-admitted technical debt
    Rungroj Maipradit
    Christoph Treude
    Hideaki Hata
    Kenichi Matsumoto
    Empirical Software Engineering, 2021, 26
  • [9] 23 Shades of Self-Admitted Technical Debt: An Empirical Study on Machine Learning Software
    OBrien, David
    Biswas, Sumon
    Imtiaz, Sayem
    Abdalkareem, Rabe
    Shihab, Emad
    Rajan, Hridesh
    PROCEEDINGS OF THE 30TH ACM JOINT MEETING EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING, ESEC/FSE 2022, 2022, : 734 - 746
  • [10] Identifying self-admitted technical debt in open source projects using text mining
    Huang, Qiao
    Shihab, Emad
    Xia, Xin
    Lo, David
    Li, Shanping
    EMPIRICAL SOFTWARE ENGINEERING, 2018, 23 (01) : 418 - 451