Similarity Code File Detection Model Based on Frequent Itemsets

Cited by: 0
Authors
Jiang, Jian-hong [1]
Wang, Ke [1]
Affiliation
[1] Guilin Univ Elect Technol, Coll Business, Guilin 541004, Peoples R China
Keywords
Source code similarity; Frequent itemset; Association rule; IN-SOURCE CODE;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
To improve the efficiency and accuracy of source code similarity detection, this paper addresses several deficiencies of current research and proposes a similar-code detection model based on frequent itemsets. The model builds frequent itemset data to discover collections of repeated code and automatically assigns files to similarity groups. Because detection does not depend on the type of code, the model is widely applicable: it can process code files written in different programming languages and grammars, mark the similar code, and compile statistics on the results. Experimental comparison shows that the model achieves high accuracy and processing efficiency.
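The abstract does not say how items, transactions, or support thresholds are defined, so the following is only a minimal sketch of the general idea, not the authors' model. It assumes each file is a transaction whose items are its whitespace-normalized lines, and it groups lines by the set of files that contain them. The names `normalize`, `frequent_line_sets`, and `min_support` are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def normalize(source):
    """Return the set of non-blank lines with runs of whitespace collapsed."""
    return {" ".join(line.split()) for line in source.splitlines() if line.strip()}


def frequent_line_sets(files, min_support=2):
    """Group lines by the set of files that contain them.

    Assumption: each file is one transaction and each normalized line is
    one item. Lines shared by the same group of at least `min_support`
    files form one frequent itemset.
    """
    containing = defaultdict(set)  # line -> names of files that contain it
    for name, source in files.items():
        for line in normalize(source):
            containing[line].add(name)

    itemsets = defaultdict(set)  # frozenset of file names -> shared lines
    for line, names in containing.items():
        if len(names) >= min_support:
            itemsets[frozenset(names)].add(line)
    return dict(itemsets)


# Toy input: two Python files that share two lines and one unrelated file.
files = {
    "a.py": "x = 1\ny = 2\nprint(x + y)\n",
    "b.py": "x = 1\ny   =   2\nprint(x * y)\n",
    "c.js": "let z = 3;\n",
}
result = frequent_line_sets(files)
for names, lines in result.items():
    print(sorted(names), sorted(lines))
```

Because the sketch compares plain text lines, it never parses the code, which is one simple way to stay independent of the programming language. A real detector would likely need a more robust item definition than exact line matches.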
Pages: 254-262
Number of pages: 9