A Diversified Feature Extraction Approach for Program Similarity Analysis

Cited by: 1
Authors
Wang, Ying [1]
Jin, Dahai [2]
Gong, Yunzhan [3]
Affiliations
[1] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 18311026809, Peoples R China
[2] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 13020034471, Peoples R China
[3] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 61198028, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Similarity detection; code plagiarism; feature extraction;
DOI
10.1145/3305160.3305189
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
As code plagiarism becomes increasingly prevalent, the need for code similarity detection technology is growing. A program's features are the basic units that represent its procedure and structure, so the quality of the features directly affects the accuracy of similarity detection results. In this paper, we propose a diversified feature extraction approach that extracts feature information from four sources: attribute counting, statement structure, program structure, and program function. During feature extraction, we consider multiple aspects of the program, such as program structure, semantics, and data flow. Evaluation results show that this approach can eliminate the interference caused by multiple plagiarism methods and that it also yields some improvement in accuracy and detection efficiency.
Pages: 96-101
Number of pages: 6