Similarity Code File Detection Model Based on Frequent Itemsets

Cited by: 0
Authors
Jiang, Jian-hong [1]
Wang, Ke [1]
Affiliations
[1] Guilin Univ Elect Technol, Coll Business, Guilin 541004, Peoples R China
Keywords
Source code similarity; Frequent itemset; Association rule; IN-SOURCE CODE;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
To improve the efficiency and accuracy of program source code similarity detection, this paper refines existing code detection methods by addressing some deficiencies of current research. It proposes a similar-code detection model based on frequent itemsets. The model builds frequent itemset data to discover collections of repeated code and to automatically group files by similarity. Because the algorithm does not need to consider the type of code during detection, it is widely applicable: it can detect similar code files across different programming languages and grammars, mark the similar code, and compile statistics on the results. Experimental comparison shows that the model achieves high accuracy and processing efficiency.
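The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each file is a transaction whose items are whitespace-normalized lines, mines itemsets level by level in an Apriori style, and scores file pairs by the share of lines they have in common. The names `normalize`, `frequent_itemsets`, and `similar_files`, the `min_support` and `max_size` parameters, and the scoring formula are all hypothetical choices made for illustration.

```python
from __future__ import annotations

from itertools import combinations


def normalize(source: str) -> set[str]:
    """Turn a file into a set of items: non-empty lines with whitespace collapsed."""
    return {" ".join(line.split()) for line in source.splitlines() if line.strip()}


def frequent_itemsets(transactions: dict[str, set[str]],
                      min_support: int = 2,
                      max_size: int = 2) -> dict[frozenset[str], set[str]]:
    """Apriori-style mining. Returns {itemset: names of the files containing it}."""
    # Level 1: map every single line to the files in which it occurs.
    level: dict[frozenset[str], set[str]] = {}
    for name, items in transactions.items():
        for item in items:
            level.setdefault(frozenset([item]), set()).add(name)
    level = {k: v for k, v in level.items() if len(v) >= min_support}

    result = dict(level)
    size = 1
    while level and size < max_size:
        size += 1
        nxt: dict[frozenset[str], set[str]] = {}
        # Join pairs of frequent itemsets that differ in exactly one item.
        for (a, files_a), (b, files_b) in combinations(level.items(), 2):
            candidate = a | b
            if len(candidate) != size or candidate in nxt:
                continue
            files = files_a & files_b  # files that contain every line in candidate
            if len(files) >= min_support:
                nxt[candidate] = files
        result.update(nxt)
        level = nxt
    return result


def similar_files(transactions: dict[str, set[str]], min_support: int = 2):
    """Score each file pair (assumed metric: shared lines / lines in the smaller file)."""
    shared: dict[tuple[str, str], set[str]] = {}
    singles = frequent_itemsets(transactions, min_support, max_size=1)
    for itemset, files in singles.items():
        for pair in combinations(sorted(files), 2):
            shared.setdefault(pair, set()).update(itemset)
    return {
        (a, b): (len(lines) / min(len(transactions[a]), len(transactions[b])),
                 sorted(lines))
        for (a, b), lines in shared.items()
    }


if __name__ == "__main__":
    # Toy inputs in two languages; the sketch never inspects the syntax.
    files = {
        "a.py": "x = 1\ny = x + 1\nprint(y)\n",
        "b.py": "x = 1\n  y = x + 1\nprint('done')\n",
        "c.js": "let z = 3;\nconsole.log(z);\n",
    }
    transactions = {name: normalize(text) for name, text in files.items()}
    for pair, (score, lines) in similar_files(transactions).items():
        print(pair, round(score, 2), lines)
```

Because the items are plain text lines, the sketch is language-agnostic in the sense the abstract describes, but it would miss similarity hidden by renamed identifiers or reordered statements.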
Pages: 254-262
Page count: 9
Related papers
50 in total
  • [31] A genetic algorithm based searching of maximal frequent itemsets
    Huang, JP
    Yang, CT
    Fu, CH
    IC-AI '04 & MLMTA'04 , VOL 1 AND 2, PROCEEDINGS, 2004, : 548 - 554
  • [32] DoS detections based on association rules and frequent itemsets
    Dept. of Computer Science and Engineering, Harbin Institute of Technology, Harbin 150001, China
    Unknown
    J. Harbin Inst. Technol., 2008, 2 (283-289):
  • [33] Mining fuzzy frequent itemsets based on UBFFP trees
    Lin, Chun-Wei
    Hong, Tzung-Pei
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2014, 27 (01) : 535 - 548
  • [34] ACCF: Associative Classification Based on Closed Frequent Itemsets
    Li, Xueming
    Qin, Dongxia
    Yu, Cun
    FIFTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 2, PROCEEDINGS, 2008, : 380 - 384
  • [35] Clustering Transactions Based on Weighting Maximal Frequent Itemsets
    Huang, Faliang
    Xie, Guoqing
    Yao, Zhiqiang
    Cai, Shengzhen
    2008 3RD INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEM AND KNOWLEDGE ENGINEERING, VOLS 1 AND 2, 2008, : 262 - +
  • [36] Evaluation measures for frequent itemsets based on distributed representations
    Ozaki, Tomonobu
    2018 SIXTH INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING (CANDAR 2018), 2018, : 153 - 159
  • [37] Efficiently mining maximal frequent itemsets based on digraph
    Ren, Zhibo
    Zhang, Qiang
    Ma, Xiujuan
    FOURTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 2, PROCEEDINGS, 2007, : 140 - +
  • [38] A regression-based algorithm for frequent itemsets mining
    Jia, Zirui
    Wang, Zengli
    DATA TECHNOLOGIES AND APPLICATIONS, 2020, 54 (03) : 259 - 273
  • [39] Cluster Based Partition Approach for Mining Frequent Itemsets
    Tiwari, Akhilesh
    Gupta, Rajendra K.
    Agrawal, Dev Prakash
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2009, 9 (06) : 191 - 199
  • [40] Research on Data stream Mining Algorithm for Frequent Itemsets Based on Sliding Window Model
    Wang, Hongmei
    Li, Fentian
    Tang, Dongkai
    Wang, Zeru
    2017 IEEE 2ND INTERNATIONAL CONFERENCE ON BIG DATA ANALYSIS (ICBDA), 2017, : 264 - 268