Nomen est Omen - The Role of Signatures in Ascribing Email Author Identity with Transformer Neural Networks

Cited by: 1
Authors
Srinivasan, Sudarshan [1]
Begoli, Edmon [1]
Mahbub, Maria [1]
Knight, Kathryn [1]
Affiliations
[1] Oak Ridge Natl Lab, Oak Ridge, TN 37830 USA
Keywords
natural language processing; authorship attribution; transformer-based networks; attention-based models; adversarial perturbation; digital forensics; ATTRIBUTION
DOI
10.1109/SPW53761.2021.00049
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Authorship attribution, an NLP problem where anonymous text is matched to its author, has important, cross-disciplinary applications, particularly those concerning cyber-defense. Our research examines the degree of sensitivity that attention-based models have to adversarial perturbations. We ask, what is the minimal amount of change necessary to maximally confuse a transformer model? In our investigation we examine a balanced subset of emails from the Enron email dataset, calculating the performance of our model before and after email signatures have been perturbed. Results show that the model's performance changed significantly in the absence of a signature, indicating the importance of email signatures in email authorship detection. Furthermore, we show that these models rely on signatures for shorter emails much more than for longer emails. We also indicate that additional research is necessary to investigate stylometric features and adversarial training to further improve classification model robustness.
Pages: 291-297
Number of pages: 7