FMCF: A fusing multiple code features approach based on Transformer for Solidity smart contracts source code summarization

Authors
Lei, Gang [1]
Zhang, Donghua [2]
Xiao, Jianmao [1]
Fan, Guodong [3]
Cao, Yuanlong [1]
Feng, Zhiyong [3]
Affiliations
[1] Jiangxi Normal Univ, Sch Software, Nanchang 330022, Peoples R China
[2] Jiangxi Normal Univ, Sch Digital Ind, Shangrao 334000, Peoples R China
[3] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300350, Peoples R China
Keywords
Solidity smart contracts; Source code summarization; Feature fusion; Transformer; Structure-based traversal; Manual evaluation; Natural-language summaries
DOI
10.1016/j.asoc.2024.112238
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A smart contract is a software program executed on a blockchain, designed to support functions such as contract execution, asset administration, and identity validation within a secure, decentralized ecosystem. Summarizing the code of Solidity smart contracts helps developers quickly grasp a contract's essential functionality, which in turn strengthens the security posture of Ethereum-based projects. Existing work on smart contract code summarization relies mainly on traditional information retrieval and on single code features, resulting in suboptimal performance. In this study, we propose FMCF, a Transformer-based approach that fuses multiple code features for Solidity summarization. First, in the data preprocessing stage, FMCF builds contract integrity modeling and state immutability modeling to process the data and keep only samples that meet the security conditions. FMCF also retains the self-attention mechanism when constructing a Graph Attention Network (GAT) encoder and a CodeBERT encoder, each of which extracts its own feature vectors from the code so that the source code information is preserved in full. The feature fusion module then combines these two types of feature vectors by weighted summation, and the fused vectors are fed into the Transformer decoder to generate the final smart contract code summary. The experimental results show that FMCF outperforms the standard baseline methods by 12.45% in BLEU score and maximally preserves the semantic information and syntactic structure of the source code. These results suggest that FMCF offers a promising direction for future research on smart contract code summarization, thereby helping developers enhance the security of their projects.
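The abstract does not say how the weights in the weighted summation are set. As a minimal sketch only, the fusion step could look like the following, assuming both encoders emit vectors of the same dimension and a single scalar weight `alpha` forms a convex combination. The function name `fuse_features`, the parameter `alpha`, and the plain-list representation are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def fuse_features(
    gat_vec: Sequence[float],
    codebert_vec: Sequence[float],
    alpha: float = 0.5,
) -> List[float]:
    """Fuse two feature vectors by weighted summation.

    Assumption (not from the paper): one scalar weight `alpha` is applied to
    the structural (GAT) vector and `1 - alpha` to the semantic (CodeBERT)
    vector, and both vectors already share the same dimension.
    """
    if len(gat_vec) != len(codebert_vec):
        raise ValueError("feature vectors must have the same dimension")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Element-wise convex combination of the two feature vectors.
    return [alpha * g + (1.0 - alpha) * c for g, c in zip(gat_vec, codebert_vec)]


# Toy 2-dimensional vectors standing in for real encoder outputs.
fused = fuse_features([1.0, 2.0], [3.0, 4.0], alpha=0.5)
print(fused)
```

In a trained model the weight would more likely be learned, and possibly set per dimension or per token, than fixed by hand; the fixed scalar here only makes the arithmetic of the fusion step visible.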
Pages: 17