DENATURE: duplicate detection and type identification in open source bug repositories

Cited by: 1
Authors
Chauhan, Ruby [1]
Sharma, Shakshi [2]
Goyal, Anjali [3]
Affiliations
[1] NorthCap Univ, Sect 23 A, Gurugram 122017, Haryana, India
[2] Univ Tartu, Tartu, Estonia
[3] Sharda Univ, Sch Engn & Technol, Dept Comp Sci & Engn, Greater Noida, India
Keywords
Bug tracking system; Bug reports; Duplicate detection; Bug type identification; Similarity measures; Classification; Information retrieval techniques; Model
DOI
10.1007/s13198-023-01855-x
Chinese Library Classification (CLC) number
T [Industrial Technology]
Subject classification code
08
Abstract
Software projects rely on bug tracking systems to guide software maintenance activities. Bug reports submitted to bug repositories carry the critical information about the nature of a crash. This information is free-form text submitted by users or developers. Bug repositories accumulate a large number of reports, and many of them are mere duplicates of existing bugs. Furthermore, not all non-duplicate bugs are reproducible. This paper introduces DENATURE, a two-step framework for detecting duplicates and identifying bug type. The proposed framework helps minimize the time and developer effort spent resolving bug reports, which in turn improves overall software quality. Information retrieval techniques are used to find duplicate bugs, and machine learning classification techniques are used to identify the type of a bug report. Through experiments, we found that the proposed framework obtained a prediction accuracy of up to 88.81%.
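To make the first step concrete, here is a minimal sketch of how information retrieval can flag a likely duplicate bug report. It is not the DENATURE implementation: the abstract does not name the similarity measure, so TF-IDF weighting with cosine similarity is assumed here as one common choice. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `find_duplicates`) and the `threshold` value are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on any non-alphanumeric character."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    # Smoothed inverse document frequency (an assumed weighting scheme).
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [
        {t: tf * idf[t] for t, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_duplicates(existing_reports, new_report, threshold=0.5):
    """Return (score, index) pairs for existing reports similar to new_report.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    vectors = tfidf_vectors(existing_reports + [new_report])
    query = vectors[-1]
    scored = [(cosine(query, v), i) for i, v in enumerate(vectors[:-1])]
    return sorted([(s, i) for s, i in scored if s >= threshold], reverse=True)


if __name__ == "__main__":
    existing = [
        "App crashes when opening settings page",
        "Login button does not respond on click",
    ]
    incoming = "Crash when opening the settings page"
    for score, index in find_duplicates(existing, incoming, threshold=0.3):
        print(f"possible duplicate of report {index}: similarity {score:.2f}")
```

In a two-step pipeline of the kind the abstract describes, reports that fall below the similarity threshold would be passed on to a classifier that predicts the bug type; that second step is not sketched here.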
Pages: S275-S292
Page count: 18