Scalable and Systematic Detection of Buggy Inconsistencies in Source Code

Cited by: 29
Authors
Gabel, Mark [1]
Yang, Junfeng [2]
Yu, Yuan
Goldszmidt, Moises
Su, Zhendong [1]
Affiliations
[1] Univ Calif Davis, Davis, CA 95616 USA
[2] Columbia Univ, New York, NY 10027 USA
Keywords
Languages; Reliability; Algorithms; Experimentation
DOI
10.1145/1932682.1869475
Chinese Library Classification number
TP31 [Computer Software]
Subject classification code
081202; 0835
Abstract
Software developers often duplicate source code to replicate functionality. This practice can hinder the maintenance of a software project: bugs may arise when two identical code segments are edited inconsistently. This paper presents DejaVu, a highly scalable system for detecting these general syntactic inconsistency bugs. DejaVu operates in two phases. Given a target code base, a parallel inconsistent clone analysis first enumerates all groups of source code fragments that are similar but not identical. Next, an extensible buggy change analysis framework refines these results, separating each group of inconsistent fragments into a fine-grained set of inconsistent changes and classifying each as benign or buggy. On a 75+ million line pre-production commercial code base, DejaVu executed in under five hours and produced a report of over 8,000 potential bugs. Our analysis of a sizable random sample suggests with high likelihood that this report contains at least 2,000 true bugs and 1,000 code smells. These bugs draw from a diverse class of software defects and are often simple to correct: syntactic inconsistencies both indicate problems and suggest solutions.
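The two phases described in the abstract can be illustrated with a toy sketch. This is not DejaVu's algorithm: the abstract gives no implementation details, so the line-level comparison with Python's standard `difflib`, the 0.7 similarity threshold, the function names, and the sample fragments are all assumptions made purely for illustration. The benign-versus-buggy classification step is left out because the abstract does not say how it is done.

```python
import difflib
from itertools import combinations

# Hypothetical sample data: two near-copies and one unrelated fragment.
FRAGMENTS = {
    "copy_a": """
        if (buf == NULL) return -1;
        len = read(fd, buf, size);
        if (len < 0) return -1;
        process(buf, len);
    """,
    "copy_b": """
        if (buf == NULL) return -1;
        len = read(fd, buf, size);
        process(buf, len);
    """,
    "unrelated": """
        x = 1;
        y = 2;
    """,
}


def normalize(fragment):
    """Split a fragment into stripped, non-empty lines."""
    return [line.strip() for line in fragment.splitlines() if line.strip()]


def inconsistent_pairs(fragments, threshold=0.7):
    """Phase 1 (toy): list pairs that are similar but not identical.

    The threshold is an assumed value, not one taken from the paper.
    """
    pairs = []
    for a, b in combinations(sorted(fragments), 2):
        lines_a, lines_b = normalize(fragments[a]), normalize(fragments[b])
        ratio = difflib.SequenceMatcher(None, lines_a, lines_b).ratio()
        if threshold <= ratio < 1.0:
            pairs.append((a, b, ratio))
    return pairs


def inconsistent_changes(frag_a, frag_b):
    """Phase 2 (toy): break one pair into fine-grained line-level changes."""
    lines_a, lines_b = normalize(frag_a), normalize(frag_b)
    matcher = difflib.SequenceMatcher(None, lines_a, lines_b)
    return [
        (tag, lines_a[i1:i2], lines_b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    for a, b, ratio in inconsistent_pairs(FRAGMENTS):
        print(f"{a} vs {b}: similarity {ratio:.2f}")
        for change in inconsistent_changes(FRAGMENTS[a], FRAGMENTS[b]):
            print("  ", change)
```

In this sample the only reported change is the error check that is present in one copy and missing from the other, the kind of inconsistency that, in the abstract's words, both indicates a problem and suggests its fix. A pairwise comparison like this is quadratic in the number of fragments and would not scale to the code-base sizes the abstract reports.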
Pages: 175-190
Number of pages: 16