Similarity Code File Detection Model Based on Frequent Itemsets

Cited by: 0
Authors
Jiang, Jian-hong [1]
Wang, Ke [1]
Affiliation
[1] Guilin Univ Elect Technol, Coll Business, Guilin 541004, Peoples R China
Keywords
Source code similarity; Frequent itemset; Association rule; IN-SOURCE CODE
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
To improve the efficiency and accuracy of program source code similarity detection, this paper addresses several deficiencies of current detection methods and proposes a similar-code detection model based on frequent itemsets. The model builds frequent itemset data to discover collections of repeated code and automatically assigns files to similarity groups. Because the model does not need to consider the type of code during detection, it is widely applicable: it can detect code files written in different programming languages and grammars, mark the similar code, and compile statistics on the results. Experimental comparison shows that the model achieves high accuracy and processing efficiency.
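The abstract does not give the algorithm itself, so the following is only a minimal sketch of one way the idea could work, not the authors' model. It assumes that each distinct normalized line of code is a transaction whose items are the files containing it, so that a pair of files is a 2-itemset whose support is the number of lines they share. The function names, the `min_len` filter, and the `min_support` threshold are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def normalize(line):
    """Collapse whitespace so formatting differences do not hide a match."""
    return " ".join(line.split())


def shared_line_counts(files, min_len=4):
    """Count, for every pair of files, how many distinct normalized lines they share.

    `files` maps a file name to its source text. Each distinct line is treated
    as a transaction over file names (an assumption of this sketch), so the
    support of a file pair is the number of lines both files contain.
    """
    owners = defaultdict(set)  # normalized line -> names of files containing it
    for name, text in files.items():
        for raw in text.splitlines():
            line = normalize(raw)
            if len(line) >= min_len:  # skip blank lines and lone braces
                owners[line].add(name)

    support = defaultdict(int)
    for names in owners.values():
        for pair in combinations(sorted(names), 2):
            support[pair] += 1
    return dict(support)


def frequent_pairs(files, min_support=2, min_len=4):
    """Keep only the file pairs whose shared-line support reaches `min_support`."""
    counts = shared_line_counts(files, min_len)
    return {pair: n for pair, n in counts.items() if n >= min_support}
```

Because the sketch compares text lines only, it is indifferent to the programming language of each file, which mirrors the language independence claimed in the abstract. It stops at pairs of files; finding larger groups of similar files would need an Apriori-style extension to bigger itemsets.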
Pages: 254-262
Number of pages: 9