Software Defect Prediction Based on Deep Representation Learning of Source Code From Contextual Syntax and Semantic Graph

Cited by: 9
Authors
Abdu, Ahmed [1 ]
Zhai, Zhengjun [1 ]
Abdo, Hakim A. [2 ]
Algabri, Redhwan [3 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Software, Xian 710072, Peoples R China
[2] Hodeidah Univ, Dept Comp Sci, Al Hudaydah 3114, Yemen
[3] Hanyang Univ, Res Inst Engn & Technol, Ansan 15588, South Korea
Keywords
Contextual representations; deep learning; graphical representations; hierarchical convolutional neural network; software defect prediction (SDP); MODEL;
DOI
10.1109/TR.2024.3354965
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Software defect prediction plays an essential role in the software development life cycle: it helps developers find defects early, thus preventing wasted time and effort. Defect prediction techniques based on semantic features have recently proven more successful than approaches based on traditional features. However, existing semantic-feature-based approaches use a single source code representation. Most studies focus on contextual syntax represented by abstract syntax trees (ASTs), and some use a control flow graph (CFG) to represent code as a graph. A single representation remains limited when predicting defects that involve calls to multiple functions, and it has a high probability of false positives. To close this gap, we propose a defect prediction model based on multiple source code representations. The proposed model is a deep hierarchical convolutional neural network (DH-CNN). Syntax features extracted from ASTs using Word2vec are fed into a syntax-level DH-CNN, and semantic-graph features extracted from the CFG and the data dependence graph (DDG) using Node2vec are fed into a semantic-level DH-CNN. In addition, the model includes a gated merging mechanism that combines the two DH-CNN outputs by estimating the combination ratio of the two feature types. Experimental results indicate that DH-CNN outperforms existing methods under both cross-project and within-project scenarios.
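The abstract does not give the formula behind the gated merging mechanism. As a rough illustration only, the sketch below assumes a common form of gating: a sigmoid gate computed from the concatenated syntax-level and semantic-level feature vectors, used as an element-wise mixing ratio. The function name `gated_merge`, the parameters `weights` and `bias`, and the example vectors are all hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_merge(h_syntax, h_semantic, weights, bias):
    """Blend two equal-length feature vectors with an element-wise gate.

    Assumed form (not confirmed by the source):
        gate   = sigmoid(weights @ [h_syntax; h_semantic] + bias)
        merged = gate * h_syntax + (1 - gate) * h_semantic
    """
    concat = list(h_syntax) + list(h_semantic)
    gate = [
        sigmoid(sum(w * x for w, x in zip(row, concat)) + b)
        for row, b in zip(weights, bias)
    ]
    merged = [
        g * s + (1.0 - g) * m
        for g, s, m in zip(gate, h_syntax, h_semantic)
    ]
    return merged, gate


# Hypothetical outputs of the syntax-level and semantic-level branches.
h_syntax = [0.2, -1.0, 0.5]
h_semantic = [1.0, 0.4, -0.3]

# Untrained gate parameters: all zeros, so every gate value is 0.5.
dim = len(h_syntax)
weights = [[0.0] * (2 * dim) for _ in range(dim)]
bias = [0.0] * dim

merged, gate = gated_merge(h_syntax, h_semantic, weights, bias)
```

With all-zero parameters the gate is 0.5 in every position, so the merged vector is the plain average of the two inputs. In a trained model the gate parameters would be learned, letting the combination ratio shift toward whichever feature type is more informative for a given input.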
Pages: 820-834
Page count: 15