FMCF: A fusing multiple code features approach based on Transformer for Solidity smart contracts source code summarization

Cited by: 0
Authors
Lei, Gang [1]
Zhang, Donghua [2]
Xiao, Jianmao [1]
Fan, Guodong [3]
Cao, Yuanlong [1]
Feng, Zhiyong [3]
Affiliations
[1] Jiangxi Normal Univ, Sch Software, Nanchang 330022, Peoples R China
[2] Jiangxi Normal Univ, Sch Digital Ind, Shangrao 334000, Peoples R China
[3] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300350, Peoples R China
Keywords
Solidity smart contracts; Source code summarization; Feature fusion; Transformer; Structure-based traversal; Manual evaluation; Natural-language summaries
DOI
10.1016/j.asoc.2024.112238
CLC number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A smart contract is a program executed on a blockchain that supports functions such as contract execution, asset administration, and identity validation within a secure, decentralized ecosystem. Summarizing Solidity smart contract code helps developers quickly grasp its essential functionality, which in turn improves the security of Ethereum-based projects. Existing work on smart contract code summarization relies mainly on traditional information retrieval and single code features, resulting in suboptimal performance. In this study, we propose FMCF, a Transformer-based approach that fuses multiple code features for Solidity code summarization. First, in the data preprocessing stage, FMCF builds contract integrity modeling and state immutability modeling to process the data and keep only samples that meet security conditions. FMCF then retains the self-attention mechanism to construct a Graph Attention Network (GAT) encoder and a CodeBERT encoder, each of which extracts feature vectors from the code, to ensure the integrity of the source code information. Next, FMCF feeds the two types of feature vectors into a feature fusion module that combines them by weighted summation, and passes the fused vectors to a Transformer decoder to generate the final smart contract code summary. Experimental results show that FMCF outperforms the standard baseline methods by 12.45% in BLEU score and maximally preserves the semantic information and syntactic structure of the source code. These results suggest that FMCF offers a promising direction for future research on smart contract code summarization, helping developers improve the security of their projects.
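The weighted-summation fusion step mentioned in the abstract can be illustrated with a minimal sketch. It assumes both encoders emit vectors of the same dimension and that a single scalar weight balances them; the function name `fuse_features` and the parameter `alpha` are hypothetical, and the abstract does not state how the weights are actually chosen or learned.

```python
from typing import List


def fuse_features(graph_vec: List[float],
                  text_vec: List[float],
                  alpha: float = 0.5) -> List[float]:
    """Combine two equal-length feature vectors by weighted summation.

    graph_vec: stand-in for the output of a structure-oriented (GAT-style) encoder.
    text_vec:  stand-in for the output of a token-oriented (CodeBERT-style) encoder.
    alpha:     hypothetical weight on graph_vec; text_vec receives (1 - alpha).
    """
    if len(graph_vec) != len(text_vec):
        raise ValueError("feature vectors must have the same dimension")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Element-wise convex combination of the two feature vectors.
    return [alpha * g + (1.0 - alpha) * t for g, t in zip(graph_vec, text_vec)]


if __name__ == "__main__":
    fused = fuse_features([1.0, 2.0], [3.0, 4.0], alpha=0.5)
    print(fused)
```

With `alpha = 1.0` the fused vector reduces to the graph features alone, and with `alpha = 0.0` to the text features alone; intermediate values trade off structural against lexical information before the result is handed to a decoder.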
Pages: 17