A Novel Automatic Query Expansion with Word Embedding for IR-based Bug Localization

Cited by: 3
Authors
Kim, Misoo [1]
Kim, Youngkyoung [2]
Lee, Eunseok [3]
Affiliations
[1] Sungkyunkwan Univ, Inst Software Convergence, Suwon, South Korea
[2] Sungkyunkwan Univ, Dept Elect & Comp Engn, Suwon, South Korea
[3] Sungkyunkwan Univ, Coll Comp & Informat, Suwon, South Korea
Funding
National Research Foundation, Singapore
Keywords
Automatic query expansion; Bug report; Information retrieval-based bug localization; Word embedding
DOI
10.1109/ISSRE52982.2021.00038
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Information retrieval-based bug localization (IRBL) aims to find buggy files by using a bug report as a query. IRBL performance depends heavily on query quality. To improve query quality for IRBL, automatic query expansion (AQE) methods have been proposed that identify query-related terms from the initially retrieved source files. This approach inevitably depends on two determinants of the post-retrieval results: the retrieval model and the initial query quality. We propose a novel word embedding-based AQE technique, WEQE, to avoid this heavy dependency of current AQE approaches. A word embedding model makes it possible to fetch terms semantically related to a query by representing words in a vector space. Our method embeds words from both a global corpus and a project-specific corpus. The initial query is expanded by adding words that are semantically similar to it according to the vector representations from our embedding model. We validated the effectiveness of WEQE using 4,583 bug reports from seven projects, four IRBL models, and two embedding models. Our large-scale experimental results show that WEQE improves the average precision of bug localization for at least 42% of all queries. On the best IRBL model, our expanded queries achieve a 6% higher mean average precision for bug localization than the initial queries.
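The expansion step described in the abstract (adding words that lie close to the query in an embedding space) can be illustrated with a minimal sketch. This is not the authors' WEQE implementation: the toy vectors in `EMBEDDINGS`, the function names `cosine` and `expand_query`, the `top_k` parameter, and the choice to compare candidates against the centroid of the query terms are all assumptions made for illustration. A real setup would load vectors trained on a global corpus and a project-specific corpus, as the abstract describes.

```python
import math

# Hypothetical toy embeddings; real vectors would come from a trained model.
EMBEDDINGS = {
    "crash": [0.9, 0.1, 0.0],
    "exception": [0.8, 0.2, 0.1],
    "error": [0.85, 0.15, 0.05],
    "button": [0.1, 0.9, 0.0],
    "widget": [0.15, 0.85, 0.1],
    "parser": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def expand_query(query_terms, embeddings, top_k=2):
    """Append the top_k vocabulary words closest to the query centroid.

    Assumption: the query is represented by the mean of its known term
    vectors; the abstract does not state how similarity to the query is
    computed.
    """
    known = [t for t in query_terms if t in embeddings]
    if not known:
        # No query term has a vector, so nothing can be added.
        return list(query_terms)

    dim = len(embeddings[known[0]])
    centroid = [
        sum(embeddings[t][i] for t in known) / len(known) for i in range(dim)
    ]

    # Score every vocabulary word that is not already in the query.
    candidates = [
        (cosine(centroid, vec), word)
        for word, vec in embeddings.items()
        if word not in query_terms
    ]
    # Highest similarity first; ties are broken alphabetically.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return list(query_terms) + [word for _, word in candidates[:top_k]]


if __name__ == "__main__":
    print(expand_query(["crash"], EMBEDDINGS))
```

Because the expansion terms come from the embedding vocabulary rather than from initially retrieved files, a sketch like this does not depend on a first retrieval pass, which is the dependency the abstract says WEQE is designed to avoid.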
Pages: 276-287
Page count: 12