Detecting duplicate bug reports with software engineering domain knowledge

Cited by: 30
Authors
Aggarwal, Karan [1]
Timbers, Finbarr [1]
Rutgers, Tanner [1]
Hindle, Abram [1]
Stroulia, Eleni [1]
Greiner, Russell [1]
Affiliations
[1] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada
Keywords
deduplication; documentation; duplicate bug reports; information retrieval; machine learning; software engineering textbooks; software literature
DOI
10.1002/smr.1821
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Bug deduplication, i.e., recognizing bug reports that refer to the same problem, is a challenging task in the software-engineering life cycle. Researchers have proposed several methods that rely primarily on information-retrieval techniques. Our work, motivated by the intuition that domain knowledge can provide the relevant context to enhance effectiveness, attempts to improve the use of information retrieval by augmenting it with software-engineering knowledge. In our previous work, we proposed the software-literature-context method, which uses software-engineering literature as a source of contextual information to detect duplicates. If bug reports relate to similar subjects, they have a better chance of being duplicates. Our method, being largely automated, has the potential to substantially decrease the manual effort involved in conventional techniques, with a minor trade-off in accuracy. In this study, we extend our work by demonstrating that domain-specific features can be applied across projects, rather than the project-specific features demonstrated previously, while still maintaining performance. We also introduce a hierarchy-of-context to capture software-engineering knowledge in the realms of contextual space and produce performance gains. Finally, we highlight the importance of domain-specific contextual features through cross-domain contexts: adding context improved accuracy, and Kappa scores improved by at least 3.8% and up to 10.8% per project.
Pages: 15
Related papers
50 in total
  • [1] Detecting Duplicate Bug Reports with Software Engineering Domain Knowledge
    Aggarwal, Karan
    Rutgers, Tanner
    Timbers, Finbarr
    Hindle, Abram
    Greiner, Russ
    Stroulia, Eleni
    2015 22ND INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, EVOLUTION, AND REENGINEERING (SANER), 2015, : 211 - 220
  • [2] Detecting Duplicate Bug Reports with Convolutional Neural Networks
    Xie, Qi
    Wen, Zhiyuan
    Zhu, Jieming
    Gao, Cuiyun
    Zheng, Zibin
    2018 25TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE (APSEC 2018), 2018, : 416 - 425
  • [3] A Comparison of Summarization Methods for Duplicate Software Bug Reports
    Mukhtar, Samal
    Primadani, Claudia Cahya
    Lee, Seonah
    Jung, Pilsu
    ELECTRONICS, 2023, 12 (16)
  • [4] Coping with Duplicate Bug Reports in Free/Open Source Software Projects
    Davidson, Jennifer L.
    Mohan, Nitin
    Jensen, Carlos
    2011 IEEE SYMPOSIUM ON VISUAL LANGUAGES AND HUMAN-CENTRIC COMPUTING (VL/HCC 2011), 2011, : 101 - 108
  • [6] Preventing duplicate bug reports by continuously querying bug reports
    Hindle, Abram
    Onuczko, Curtis
    EMPIRICAL SOFTWARE ENGINEERING, 2019, 24 (02) : 902 - 936
  • [7] An Approach to Detecting Duplicate Bug Reports using Natural Language and Execution Information
    Wang, Xiaoyin
    Zhang, Lu
    Xie, Tao
    Anvik, John
    Sun, Jiasu
    ICSE'08 PROCEEDINGS OF THE THIRTIETH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, 2008, : 461 - 470
  • [8] Duplicate Bug Reports Considered Harmful ... Really?
    Bettenburg, Nicolas
    Premraj, Rahul
    Zimmermann, Thomas
    Kim, Sunghun
    2008 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE, 2008, : 337 - 345
  • [9] Domain knowledge-based security bug reports prediction
    Zheng, Wei
    Cheng, JingYuan
    Wu, Xiaoxue
    Sun, Ruiyang
    Wang, Xiaolong
    Sun, Xiaobing
    KNOWLEDGE-BASED SYSTEMS, 2022, 241
  • [10] Semantic GUI Scene Learning and Video Alignment for Detecting Duplicate Video-based Bug Reports
    William & Mary, Williamsburg
    VA, United States
    Unknown
    FL, United States
    arXiv,