Detecting duplicate bug reports with software engineering domain knowledge

Cited by: 30
Authors
Aggarwal, Karan [1]
Timbers, Finbarr [1]
Rutgers, Tanner [1]
Hindle, Abram [1]
Stroulia, Eleni [1]
Greiner, Russell [1]
Affiliations
[1] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada
Keywords
deduplication; documentation; duplicate bug reports; information retrieval; machine learning; software engineering textbooks; software literature
DOI
10.1002/smr.1821
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Bug deduplication, i.e., recognizing bug reports that refer to the same problem, is a challenging task in the software-engineering life cycle. Researchers have proposed several methods that rely primarily on information-retrieval techniques. Our work, motivated by the intuition that domain knowledge can provide the relevant context to enhance effectiveness, attempts to improve the use of information retrieval by augmenting it with software-engineering knowledge. In our previous work, we proposed the software-literature-context method, which uses software-engineering literature as a source of contextual information to detect duplicates. If bug reports relate to similar subjects, they have a better chance of being duplicates. Our method, being largely automated, has the potential to substantially decrease the level of manual effort involved in conventional techniques, with a minor trade-off in accuracy. In this study, we extend our work by demonstrating that domain-specific features can be applied across projects, rather than the project-specific features demonstrated previously, while still maintaining performance. We also introduce a hierarchy of context to capture software-engineering knowledge in the realms of contextual space and to produce performance gains. Finally, we highlight the importance of domain-specific contextual features through cross-domain contexts: adding context improved accuracy, and Kappa scores improved by at least 3.8% and up to 10.8% per project.
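The intuition that reports on similar subjects are more likely to be duplicates can be illustrated with a minimal sketch. This is not the authors' implementation: the context word lists, the function names, and the sample reports below are all hypothetical, and simple word counting with cosine similarity stands in for whatever features and classifier the paper actually uses.

```python
import math
import re
from collections import Counter

# Hypothetical context word lists. The paper derives its contexts from
# software-engineering literature; these tiny lists are placeholders.
CONTEXTS = {
    "testing": ["test", "assert", "coverage", "regression"],
    "ui": ["button", "screen", "click", "layout"],
    "storage": ["database", "disk", "file", "cache"],
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def context_vector(report):
    """Score a report against each context by counting word-list hits."""
    counts = Counter(tokenize(report))
    return [sum(counts[w] for w in words) for words in CONTEXTS.values()]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Invented sample reports: a and b both concern the UI, c concerns storage.
a = "App crashes after click on the save button, screen goes blank"
b = "Blank screen when the button is pressed twice"
c = "Cache file on disk is corrupted after database migration"

sim_ab = cosine(context_vector(a), context_vector(b))
sim_ac = cosine(context_vector(a), context_vector(c))
print(sim_ab > sim_ac)
```

In a duplicate-detection pipeline, a score like this would typically be one feature among several (alongside textual similarity of titles and descriptions) fed to a classifier, rather than a decision rule on its own.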
Pages: 15
Related papers
50 in total
  • [21] New Methodology for Contextual Features Usage in Duplicate Bug Reports Detection
    Neysiani, Behzad Soleimani
    Babamir, Seyed Morteza
    2019 5TH INTERNATIONAL CONFERENCE ON WEB RESEARCH (ICWR), 2019, : 178 - 183
  • [22] A Replication Package for It Takes Two to TANGO: Combining Visual and Textual Information for Detecting Duplicate Video-Based Bug Reports
    Cooper, Nathan
    Bernal-Cardenas, Carlos
    Chaparro, Oscar
    Moran, Kevin
    Poshyvanyk, Denys
    2021 IEEE/ACM 43RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: COMPANION PROCEEDINGS (ICSE-COMPANION 2021), 2021, : 160 - 161
  • [23] SOFTWARE MODULE CLASSIFICATION FOR COMMERCIAL BUG REPORTS
    Ozturk, Ceyhun E.
    Yilmaz, Eyup Halit
    Koksal, Omer
    Koc, Aykut
    2023 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW, 2023,
  • [24] Modeling Domain Knowledge in Support of Requirements Analysis in Software Engineering
    Li, Zhi
    Hall, Jon G.
    Rapanotti, Lucia
    2010 INTERNATIONAL CONFERENCE ON COMMUNICATION AND VEHICULAR TECHNOLOGY (ICCVT 2010), VOL II, 2010, : 270 - 273
  • [25] Identifying and Detecting Inaccurate Stack Traces in Bug Reports
    Bheree, Meher Kiran
    Anvik, John
    2024 7TH INTERNATIONAL CONFERENCE ON SOFTWARE AND SYSTEM ENGINEERING, ICOSSE 2024, 2024, : 9 - 14
  • [26] An HMM-based approach for automatic detection and classification of duplicate bug reports
    Ebrahimi, Neda
    Trabelsi, Abdelaziz
    Islam, Md Shariful
    Hamou-Lhadj, Abdelwahab
    Khanmohammadi, Kobra
    INFORMATION AND SOFTWARE TECHNOLOGY, 2019, 113 : 98 - 109
  • [27] Software engineering and knowledge engineering
    Juristo, N
    Acuña, ST
    EXPERT SYSTEMS WITH APPLICATIONS, 2002, 23 (04) : 345 - 347
  • [28] Towards Word Embeddings for Improved Duplicate Bug Report Retrieval in Software Repositories
    Budhiraja, Amar
    Dutta, Kartik
    Shrivastava, Manish
    Reddy, Raghu
    PROCEEDINGS OF THE 2018 ACM SIGIR INTERNATIONAL CONFERENCE ON THEORY OF INFORMATION RETRIEVAL (ICTIR'18), 2018, : 167 - 170
  • [29] Analyzing Bug Reports by Topic Mining in Software Evolution
    Nguyen, Uy
    Cheng, Kowk Sun
    Cho, Samuel Sungmin
    Song, Myoungkyu
    2021 IEEE 45TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2021), 2021, : 1645 - 1652
  • [30] Invalid bug reports complicate the software aging situation
    Wu, Xiaoxue
    Zheng, Wei
    Pu, Minchao
    Chen, Jie
    Mu, Dejun
    SOFTWARE QUALITY JOURNAL, 2020, 28 (01) : 195 - 220