New Methodology for Contextual Features Usage in Duplicate Bug Reports Detection

Cited by: 0
Authors
Neysiani, Behzad Soleimani [1]
Babamir, Seyed Morteza [1]
Affiliations
[1] Univ Kashan, Fac Comp & Elect Engn, Dept Software Engn, Kashan, Esfahan, Iran
Keywords
Information Retrieval; Natural Language Processing; Duplicate Detection; Bug Reports; Topic; Feature Expansion
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Duplicate bug report detection is one of the major problems that software triage systems such as Bugzilla face when handling end-user requests. A user request contains some categorical fields and, above all, textual fields, which require feature extraction before duplicates can be detected. Contextual and topical features are obtained by computing the cosine similarity between the term frequency, inverse document frequency, or BM25F scores of a pair of bug reports against a set of topics. This research proposes computing an individual Manhattan distance similarity for every topic in the contextual features, instead of a cosine similarity, in order to expand the feature dimension, which can increase the accuracy of the duplicate bug report detection process. Four well-known bug report datasets (Android, Eclipse, Mozilla, and Open Office) were used to evaluate the proposed method, and the experimental results indicate a performance improvement for four contextual features: the general, cryptography, network, and Java topics.
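The difference between the two feature constructions can be illustrated with a minimal Python sketch. It is one reading of the abstract, not the authors' implementation: it assumes each bug report has already been scored against every topic (for example with BM25F), and it interprets the "individual Manhattan distance" as the absolute difference of the two reports' scores on each topic. The function names and the score values are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Collapse two topic-score vectors into a single similarity feature."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention for an all-zero vector (assumption)
    return dot / (norm_u * norm_v)


def per_topic_manhattan(u, v):
    """Return one absolute-difference feature per topic (assumed reading
    of the 'individual Manhattan distance' described in the abstract)."""
    return [abs(a - b) for a, b in zip(u, v)]


# Hypothetical scores of two bug reports against four topics of one context.
report_a = [0.5, 0.0, 2.0, 1.0]
report_b = [0.5, 1.0, 0.0, 1.0]

single_feature = cosine_similarity(report_a, report_b)       # 1 feature
expanded_features = per_topic_manhattan(report_a, report_b)  # 4 features
```

Under this reading, a context with k topics contributes k features to the classifier instead of one, which is the sense in which the feature dimension is expanded. Summing the per-topic values recovers the ordinary Manhattan distance between the two score vectors.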
Pages: 178-183
Page count: 6