Similarity Code File Detection Model Based on Frequent Itemsets

Cited by: 0
Authors
Jiang, Jian-hong [1]
Wang, Ke [1]
Affiliations
[1] Guilin Univ Elect Technol, Coll Business, Guilin 541004, Peoples R China
Keywords
Source code similarity; Frequent itemset; Association rule; IN-SOURCE CODE
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
To improve the efficiency and accuracy of program source code similarity detection, this paper refines existing code detection methods to address some deficiencies in current research and proposes a similar-code detection model based on frequent itemsets. The model constructs frequent itemset data to discover collections of repeated code and automatically determines which files belong together as similar. Because the algorithm does not need to consider the type of code during detection, it is widely applicable: it can detect code files written in different programming languages and grammars, mark similar code, and compile statistics on the results. Experimental comparison shows that the model achieves high accuracy and processing efficiency.
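The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' model. It assumes each file is treated as a transaction whose items are whitespace-normalized lines, which keeps it independent of the programming language. It is deliberately simplified: it counts single-line items and groups them by file pair rather than running a full Apriori or FP-growth search. All function names and the `min_support` parameter are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def normalize(line):
    """Collapse runs of whitespace so formatting differences are ignored."""
    return " ".join(line.split())


def file_items(text):
    """Treat a file as a transaction: the set of its non-empty normalized lines."""
    return {normalize(line) for line in text.splitlines() if normalize(line)}


def frequent_lines(files, min_support=2):
    """Return {line: set of file names} for lines found in at least min_support files."""
    owners = defaultdict(set)
    for name, text in files.items():
        for item in file_items(text):
            owners[item].add(name)
    return {item: names for item, names in owners.items() if len(names) >= min_support}


def shared_itemsets(files, min_support=2):
    """Map each pair of file names to the set of frequent lines both files contain."""
    pairs = defaultdict(set)
    for item, names in frequent_lines(files, min_support).items():
        for a, b in combinations(sorted(names), 2):
            pairs[(a, b)].add(item)
    return dict(pairs)


def similarity(files, a, b, min_support=2):
    """Shared frequent lines divided by the size of the smaller file (assumed metric)."""
    key = tuple(sorted((a, b)))
    shared = shared_itemsets(files, min_support).get(key, set())
    smaller = min(len(file_items(files[a])), len(file_items(files[b])))
    return len(shared) / smaller if smaller else 0.0
```

Calling `shared_itemsets` on a dictionary of file name to file text returns, for each pair of files, the lines that could be marked as similar code, and `similarity` turns that into a rough per-pair score. The ratio used here is an assumption chosen for illustration; the paper may define similarity differently.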
Pages: 254-262
Page count: 9
Related papers
50 in total
  • [41] Real Time Network File Similarity Detection Based on Approximate Matching
    Zhai, Aonan
    Xu, Fei
    Cao, Zigang
    Pan, Haiqing
    Li, Zhen
    Xiong, Gang
    2017 IEEE SECURITY AND PRIVACY WORKSHOPS (SPW 2017), 2017, : 223 - 228
  • [42] An efficiently algorithm based on itemsets-lattice and bitmap index for finding frequent itemsets
    Chen, FA
    Li, MQ
    FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, PT 2, PROCEEDINGS, 2005, 3614 : 420 - 429
  • [43] DoS detections based on association rules and frequent itemsets
    George S Oreku
    Fredrick J Mtenzi
    李建中
    Journal of Harbin Institute of Technology, 2008, (02) : 283 - 289
  • [44] Signature-based Tree for Finding Frequent Itemsets
    Benelhadj, Mohamed El Hadi
    Deye, Mohamed Mahmoud
    Slimani, Yahya
    JOURNAL OF COMMUNICATIONS SOFTWARE AND SYSTEMS, 2023, 19 (01) : 70 - 80
  • [45] Mining frequent itemsets Algorithm Based on Compression Matrix
    Lin, Zizhi
    Shu, Sihui
    MECHATRONICS ENGINEERING, COMPUTING AND INFORMATION TECHNOLOGY, 2014, 556-562 : 3501 - 3505
  • [46] A frequent itemsets mining algorithm based on Spatial Partition
    Liu, Tieying
    Chen, Lirong
    Wang, Guoguang
    2009 2ND IEEE INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND INFORMATION TECHNOLOGY, VOL 5, 2009, : 339 - +
  • [47] An efficient maximal frequent itemsets mining algorithm - Based on frequent pattern tree
    Xue, XR
    Wang, GY
    Wu, Y
    Yang, SX
    DYNAMICS OF CONTINUOUS DISCRETE AND IMPULSIVE SYSTEMS-SERIES B-APPLICATIONS & ALGORITHMS, 2005, 1 : 176 - 181
  • [48] Compressed Bitmaps Based Frequent Itemsets Mining on Hadoop
    Saeed, Aref A.
    Rauf, Azhar
    Khusro, Shah
    Mahfooz, Saeed
    INTERNATIONAL CONFERENCE ON INFORMATICS AND SYSTEMS (INFOS 2016), 2016, : 159 - 165
  • [49] Mining maximal frequent itemsets by a boolean based approach
    Salleb, A
    Maazouzi, Z
    Vrain, C
    ECAI 2002: 15TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2002, 77 : 385 - 389
  • [50] Binary Code Similarity Detection
    Liu, Zian
    2021 36TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING ASE 2021, 2021, : 1056 - 1060