New Methodology for Contextual Features Usage in Duplicate Bug Reports Detection

Cited by: 0
Authors
Neysiani, Behzad Soleimani [1]
Babamir, Seyed Morteza [1]
Affiliations
[1] Univ Kashan, Fac Comp & Elect Engn, Dept Software Engn, Kashan, Esfahan, Iran
Keywords
Information Retrieval; Natural Language Processing; Duplicate Detection; Bug Reports; Topic; Feature Expansion
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Duplicate bug report detection is one of the major problems that software triage systems such as Bugzilla face when handling end-user requests. A user request contains categorical and, especially, textual fields that require feature extraction for duplicate detection. Contextual and topical features are obtained by calculating the cosine similarity between the term frequency, inverse document frequency, or BM25F scores of a pair of bug reports against a set of topics. This research proposes an individual Manhattan distance similarity for every topic in the contextual features, instead of the cosine distance similarity, to expand the feature dimension, which can increase the accuracy of duplicate bug report detection. Four well-known bug report datasets, Android, Eclipse, Mozilla, and Open Office, were used to evaluate the proposed method. The experimental results indicate a performance improvement for four contextual features: general, cryptography, network, and Java topics.
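
The contrast the abstract draws can be illustrated with a minimal Python sketch. It assumes each bug report has already been scored against every topic of one contextual word list (for example with BM25F); the score vectors, the function names, and the choice of four topics are hypothetical and are not taken from the paper. Cosine similarity collapses the pair of vectors into one feature, while a per-topic absolute difference (the individual terms of the Manhattan distance) yields one feature per topic.

```python
import math


def cosine_similarity(a, b):
    """Single similarity value for two per-topic score vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def per_topic_manhattan(a, b):
    """One feature per topic: the absolute difference of the two scores.

    Summing this list gives the ordinary Manhattan distance; keeping the
    terms separate is what expands the feature dimension.
    """
    return [abs(x - y) for x, y in zip(a, b)]


# Hypothetical per-topic scores for two bug reports (illustrative values only).
report_a = [0.8, 0.1, 0.4, 0.0]
report_b = [0.6, 0.1, 0.0, 0.2]

single_feature = cosine_similarity(report_a, report_b)       # 1 feature
expanded_features = per_topic_manhattan(report_a, report_b)  # 4 features
```

Under these assumptions, a classifier would receive four values per contextual word list instead of one, so it can weight each topic separately rather than relying on a single aggregate score.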
Pages: 178-183
Number of pages: 6