Machine Learning Approaches for Authorship Attribution using Source Code Stylometry

Cited by: 3
Authors
Frankel, Sophia F. [1 ]
Ghosh, Krishnendu [1 ]
Affiliations
[1] Coll Charleston, Dept Comp Sci, Charleston, SC 29401 USA
Keywords
Authorship Attribution; Stylometry; Source Code; Machine Learning; Plagiarism Detection
DOI
10.1109/BigData52589.2021.9671332
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Identifying the author of source code is vital for attribution. This work describes a machine learning framework for identifying source code authorship. The framework integrates features extracted with natural language processing based approaches and features derived from the abstract syntax tree of the code. The methodology is evaluated on the Google Code Jam dataset, and performance measures are reported for logistic regression and deep learning models on that dataset.
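The feature integration mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the submissions are Python files (so the standard `ast` module can parse them), uses token bigrams as a stand-in for the NLP-based features, and uses AST node-type counts as a stand-in for the syntax-tree features. The function names `lexical_features`, `ast_features`, and `combined_features` are hypothetical.

```python
import ast
import re
from collections import Counter


def lexical_features(source: str, n: int = 2) -> Counter:
    """Count token n-grams over a crude token stream (an NLP-style view)."""
    # Identifiers/keywords, integer literals, and single punctuation characters.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", source)
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter("lex:" + " ".join(gram) for gram in grams)


def ast_features(source: str) -> Counter:
    """Count AST node types (a syntactic view of the same code)."""
    tree = ast.parse(source)
    return Counter("ast:" + type(node).__name__ for node in ast.walk(tree))


def combined_features(source: str) -> Counter:
    """Merge both feature families into one sparse feature dictionary."""
    # The "lex:" and "ast:" prefixes keep the two families from colliding.
    return lexical_features(source) + ast_features(source)


sample = "def add(a, b):\n    return a + b\n"
features = combined_features(sample)
```

A dictionary of this kind, computed per file, could then be vectorized and passed to a classifier such as logistic regression; the record above does not state which features or hyperparameters the paper actually uses.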
Pages: 3298 - 3304
Page count: 7
Related papers
50 in total
  • [1] Authorship Attribution Using Stylometry and Machine Learning Techniques
    Ramnial, Hoshiladevi
    Panchoo, Shireen
    Pudaruth, Sameerchand
    [J]. INTELLIGENT SYSTEMS TECHNOLOGIES AND APPLICATIONS, VOL 1, 2016, 384 : 113 - 125
  • [2] Misleading Authorship Attribution of Source Code using Adversarial Learning
    Quiring, Erwin
    Maier, Alwin
    Rieck, Konrad
    [J]. PROCEEDINGS OF THE 28TH USENIX SECURITY SYMPOSIUM, 2019 : 479 - 496
  • [3] Authorship Attribution in Latin Languages using Stylometry
    Varela, P.
    Albonico, M.
    Justino, E.
    Assis, J.
    [J]. IEEE LATIN AMERICA TRANSACTIONS, 2020, 18 (04) : 729 - 735
  • [4] On Improving Authorship Attribution of Source Code
    Tennyson, Matthew F.
    [J]. DIGITAL FORENSICS AND CYBER CRIME, ICDF2C 2012, 2013, 114 : 58 - 65
  • [5] Source code authorship attribution using n-grams
    Burrows, Steven
    Tahaghoghi, S.M.M.
    [J]. ADCS 2007 - Proceedings of the Twelfth Australasian Document Computing Symposium, 2007 : 32 - 39
  • [6] Stylometry and Authorship Attribution: Introduction to the Special Issue
    Calle-Martin, Javier
    Miranda-Garcia, Antonio
    [J]. ENGLISH STUDIES, 2012, 93 (03) : 251 - 258
  • [7] Authorship attribution in twitter: a comparative study of machine learning and deep learning approaches
    Rebeh Imane Ammar Aouchiche
    Fatima Boumahdi
    Mohamed Abdelkarim Remmide
    Amina Madani
    [J]. International Journal of Information Technology, 2024, 16 (5) : 3303 - 3310
  • [8] Comparing techniques for authorship attribution of source code
    Burrows, Steven
    Uitdenbogerd, Alexandra L.
    Turpin, Andrew
    [J]. SOFTWARE-PRACTICE & EXPERIENCE, 2014, 44 (01) : 1 - 32
  • [9] Analysis of Source Code Authorship Attribution Problem
    Bogdanova, Alina
    Farina, Mirko
    Kholmatova, Zamira
    Kruglov, Artem
    Romanov, Vitaly
    Succi, Giancarlo
    [J]. 2022 INTERNATIONAL CONFERENCE ON COMPUTERS AND ARTIFICIAL INTELLIGENCE TECHNOLOGIES, CAIT, 2022 : 109 - 115
  • [10] Android Authorship Attribution Using Source Code-Based Features
    Aydogan, Emre
    Sen, Sevil
    [J]. IEEE ACCESS, 2024, 12 : 6569 - 6589