Bug localization using latent Dirichlet allocation

Cited by: 219
Authors
Lukins, Stacy K. [1]
Kraft, Nicholas A. [2]
Etzkorn, Letha H. [1]
Affiliations
[1] Univ Alabama, Dept Comp Sci, Huntsville, AL 35899 USA
[2] Univ Alabama, Dept Comp Sci, Tuscaloosa, AL 35487 USA
Funding
U.S. National Science Foundation;
Keywords
Bug localization; Program comprehension; Latent Dirichlet allocation; Information retrieval; DESIGN INSTABILITY;
DOI
10.1016/j.infsof.2010.04.002
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Context: Some recent static techniques for automatic bug localization have been built around modern information retrieval (IR) models such as latent semantic indexing (LSI). Latent Dirichlet allocation (LDA) is a generative statistical model that has significant advantages, in modularity and extensibility, over both LSI and probabilistic LSI (pLSI). Moreover, LDA has been shown effective in topic model based information retrieval. In this paper, we present a static LDA-based technique for automatic bug localization and evaluate its effectiveness.
Objective: We evaluate the accuracy and scalability of the LDA-based technique and investigate whether it is suitable for use with open-source software systems of varying size, including those developed using agile methods.
Method: We present five case studies designed to determine the accuracy and scalability of the LDA-based technique, as well as its relationships to software system size and to source code stability. The studies examine over 300 bugs across more than 25 iterations of three software systems.
Results: The results of the studies show that the LDA-based technique maintains sufficient accuracy across all bugs in a single iteration of a software system and is scalable to a large number of bugs across multiple revisions of two software systems. The results of the studies also indicate that the accuracy of the LDA-based technique is not affected by the size of the subject software system or by the stability of its source code base.
Conclusion: We conclude that an effective static technique for automatic bug localization can be built around LDA. We also conclude that there is no significant relationship between the accuracy of the LDA-based technique and the size of the subject software system or the stability of its source code base. Thus, the LDA-based technique is widely applicable.
(C) 2010 Elsevier B.V. All rights reserved.
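The abstract does not spell out the mechanics of the technique, so the following is only a minimal sketch of how an LDA-based ranking of source files against a bug report might look. Everything in it is an assumption for illustration: the toy corpus, the file names, the hyperparameters (alpha, beta, number of topics), the use of collapsed Gibbs sampling, and the query-likelihood scoring are not taken from the paper.

```python
import math
import random
from collections import Counter


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iters=100, seed=0):
    """Fit LDA by collapsed Gibbs sampling; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                       # tokens assigned to topic k
    assign = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Smoothed estimates: theta[d][t] = P(topic t | doc d), phi[t][w] = P(w | topic t).
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        {w: (topic_word[t][w] + beta) / (topic_total[t] + vocab_size * beta)
         for w in vocab}
        for t in range(num_topics)
    ]
    return theta, phi


def rank_files(query, names, theta, phi):
    """Rank files by log P(query | file) = sum_w log sum_t P(w | t) P(t | file)."""
    scored = []
    for name, th in zip(names, theta):
        log_lik = 0.0
        for w in query:
            if w not in phi[0]:
                continue  # skip query words that never occur in the corpus
            log_lik += math.log(sum(th[t] * phi[t][w] for t in range(len(th))))
        scored.append((log_lik, name))
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical corpus: each "document" is the bag of identifiers and comment
# words extracted from one source file.
names = ["Parser.java", "Renderer.java", "Cache.java"]
docs = [
    "parse token syntax error parse grammar token".split(),
    "draw render screen pixel color render".split(),
    "cache evict memory store cache key".split(),
]
theta, phi = fit_lda(docs, num_topics=3)
query = "syntax error while parse".split()  # words from a hypothetical bug report
ranking = rank_files(query, names, theta, phi)
print(ranking)
```

In this sketch the bug report is treated as a query and each file is scored by the likelihood of the query words under that file's topic mixture. A real setup would also need preprocessing (identifier splitting, stop-word removal, stemming) and far more sampling iterations than a toy corpus requires.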
Pages: 972 - 990
Page count: 19
Related papers
50 in total
  • [31] Distributed Latent Dirichlet Allocation on Streams
    Guo, Yunyan
    Li, Jianzhong
    [J]. ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2022, 16 (01)
  • [32] Parallel Latent Dirichlet Allocation on GPUs
    Moon, Gordon E.
    Nisa, Israt
    Sukumaran-Rajam, Aravind
    Bandyopadhyay, Bortik
    Parthasarathy, Srinivasan
    Sadayappan, P.
    [J]. COMPUTATIONAL SCIENCE - ICCS 2018, PT II, 2018, 10861 : 259 - 272
  • [33] INFERENCE IN SUPERVISED LATENT DIRICHLET ALLOCATION
    Lakshminarayanan, Balaji
    Raich, Raviv
    [J]. 2011 IEEE INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2011
  • [34] Slow mixing for Latent Dirichlet Allocation
    Jonasson, Johan
    [J]. STATISTICS & PROBABILITY LETTERS, 2017, 129 : 96 - 100
  • [35] Selecting Priors for Latent Dirichlet Allocation
    Syed, Shaheen
    Spruit, Marco
    [J]. 2018 IEEE 12TH INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC), 2018 : 194 - 202
  • [36] Latent IBP Compound Dirichlet Allocation
    Archambeau, Cedric
    Lakshminarayanan, Balaji
    Bouchard, Guillaume
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2015, 37 (02) : 321 - 333
  • [37] Crowd labeling latent Dirichlet allocation
    Pion-Tonachini, Luca
    Makeig, Scott
    Kreutz-Delgado, Ken
    [J]. Knowledge and Information Systems, 2017, 53 : 749 - 765
  • [38] Bibliometric Analysis of Latent Dirichlet Allocation
    Garg, Mohit
    Rangra, Priya
    [J]. DESIDOC JOURNAL OF LIBRARY & INFORMATION TECHNOLOGY, 2022, 42 (02) : 105 - 113
  • [39] Labeled Phrase Latent Dirichlet Allocation
    Tang, Yi-Kun
    Mao, Xian-Ling
    Huang, Heyan
    [J]. WEB INFORMATION SYSTEMS ENGINEERING - WISE 2016, PT I, 2016, 10041 : 525 - 536
  • [40] A Spectral Algorithm for Latent Dirichlet Allocation
    Anandkumar, Anima
    Foster, Dean P.
    Hsu, Daniel
    Kakade, Sham M.
    Liu, Yi-Kai
    [J]. Algorithmica, 2015, 72 : 193 - 214