DeepBugs: A learning approach to name-based bug detection

Cited by: 191
Authors
Pradel M. [1]
Sen K. [2, 3]
Affiliations
[1] TU Darmstadt, Department of Computer Science
[2] University of California, Berkeley
Keywords
Bug detection; JavaScript; Machine learning; Name-based program analysis; Natural language
DOI
10.1145/3276517
Abstract
Natural language elements in source code, e.g., the names of variables and functions, convey useful information. However, most existing bug detection tools ignore this information and therefore miss some classes of bugs. The few existing name-based bug detection approaches reason about names on a syntactic level and rely on manually designed and tuned algorithms to detect bugs. This paper presents DeepBugs, a learning approach to name-based bug detection, which reasons about names based on a semantic representation and which automatically learns bug detectors instead of manually writing them. We formulate bug detection as a binary classification problem and train a classifier that distinguishes correct from incorrect code. To address the challenge that effectively learning a bug detector requires examples of both correct and incorrect code, we create likely incorrect code examples from an existing corpus of code through simple code transformations. A novel insight learned from our work is that learning from artificially seeded bugs yields bug detectors that are effective at finding bugs in real-world code. We implement our idea into a framework for learning-based and name-based bug detection. Three bug detectors built on top of the framework detect accidentally swapped function arguments, incorrect binary operators, and incorrect operands in binary operations. Applying the approach to a corpus of 150,000 JavaScript files yields bug detectors that have a high accuracy (between 89% and 95%), are very efficient (less than 20 milliseconds per analyzed file), and reveal 102 programming mistakes (with 68% true positive rate) in real-world code. © 2018 Copyright held by the owner/author(s).
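The abstract's idea of creating likely incorrect examples through simple code transformations can be illustrated with a minimal sketch for the swapped-argument case. It is not the authors' implementation: the `CallSite` type, the function names, and the three-call corpus are hypothetical, and the classifier that would be trained on these pairs is left out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CallSite:
    """A two-argument call reduced to the names it mentions (hypothetical type)."""
    callee: str
    first_arg: str
    second_arg: str


def seed_swapped_argument_bug(call: CallSite) -> CallSite:
    """Return a copy of the call with its two arguments swapped."""
    return CallSite(call.callee, call.second_arg, call.first_arg)


def build_training_pairs(calls: List[CallSite]) -> List[Tuple[CallSite, int]]:
    """Label each original call 0 (assumed correct) and its swapped variant 1 (likely buggy)."""
    examples: List[Tuple[CallSite, int]] = []
    for call in calls:
        if call.first_arg == call.second_arg:
            # Swapping identical names would yield the same call, so skip it.
            continue
        examples.append((call, 0))
        examples.append((seed_swapped_argument_bug(call), 1))
    return examples


# Made-up corpus of call sites, standing in for calls extracted from real code.
corpus = [
    CallSite("setSize", "width", "height"),
    CallSite("setTimeout", "callback", "delay"),
    CallSite("max", "x", "x"),
]

training_pairs = build_training_pairs(corpus)
for call, label in training_pairs:
    print(label, f"{call.callee}({call.first_arg}, {call.second_arg})")
```

Each retained call contributes one example assumed correct and one seeded with a bug, so the two classes stay balanced. A binary classifier would then be trained on a vector representation of the names in each example, which this sketch does not attempt.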