Topic-based software defect explanation

Cited by: 15
Authors
Chen, Tse-Hsun [1]
Shang, Weiyi [2]
Nagappan, Meiyappan [3]
Hassan, Ahmed E. [1]
Thomas, Stephen W. [1]
Affiliations
[1] Queens Univ, Sch Comp, SAIL, Kingston, ON, Canada
[2] Concordia Univ, Montreal, PQ, Canada
[3] Rochester Inst Technol, Rochester, NY 14623 USA
Keywords
Code quality; Topic modeling; LDA; Metrics; Cohesion; Coupling; Conceptual cohesion; Prediction
DOI
10.1016/j.jss.2016.05.015
Chinese Library Classification
TP31 [Computer software]
Subject classification codes
081202; 0835
Abstract
Researchers continue to propose metrics using measurable aspects of software systems to understand software quality. However, these metrics largely ignore the functionality, i.e., the conceptual concerns, of software systems. Such concerns are the technical concepts that reflect the system's business logic. For instance, while lines of code may be a good general measure for defects, a large file responsible for simple I/O tasks is likely to have fewer defects than a small file responsible for complicated compiler implementation details. In this paper, we study the effect of concerns on software quality. We use a statistical topic modeling approach to approximate software concerns as topics (related words in source code). We propose various metrics using these topics to help explain the file defect-proneness. Case studies on multiple versions of Firefox, Eclipse, Mylyn, and NetBeans show that (i) some topics are more defect-prone than others; (ii) defect-prone topics tend to remain so over time; (iii) our topic-based metrics provide additional explanatory power for software quality over existing structural and historical metrics; and (iv) our topic-based cohesion metric outperforms state-of-the-art topic-based cohesion and coupling metrics in terms of defect explanatory power, while being simpler to implement and more intuitive to interpret. (C) 2016 Elsevier Inc. All rights reserved.
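To make the idea of "some topics are more defect-prone than others" concrete, the following is a minimal sketch, not the paper's own metric definition. It assumes each file already has a topic membership vector (for example, from LDA) plus a defect count and a size in lines of code; the function name `topic_defect_density` and the weighting scheme are illustrative assumptions.

```python
from typing import List, Sequence


def topic_defect_density(
    memberships: Sequence[Sequence[float]],
    defects: Sequence[int],
    loc: Sequence[int],
) -> List[float]:
    """Attribute each file's defect density to topics by membership weight.

    memberships[f][z] is the (assumed) share of file f belonging to topic z.
    defects[f] and loc[f] are the defect count and lines of code of file f.
    This weighting is an illustrative assumption, not the paper's formula.
    """
    n_topics = len(memberships[0])
    density = [0.0] * n_topics
    for theta, n_defects, size in zip(memberships, defects, loc):
        file_density = n_defects / size  # defects per line of code
        for z, weight in enumerate(theta):
            density[z] += weight * file_density
    return density


if __name__ == "__main__":
    # Two hypothetical files and two hypothetical topics.
    theta = [[0.5, 0.5], [0.0, 1.0]]
    print(topic_defect_density(theta, defects=[4, 1], loc=[100, 50]))
```

Topics with a higher score concentrate in files with more defects per line, which is one simple way to rank topics by defect-proneness before building file-level metrics on top of them.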
Pages: 79-106
Page count: 28