BREAKING THE CLOSED-WORLD ASSUMPTION IN STYLOMETRIC AUTHORSHIP ATTRIBUTION

Cited by: 0
Authors
Stolerman, Ariel
Overdorf, Rebekah
Afroz, Sadia
Greenstadt, Rachel
Keywords
Forensic stylometry; authorship attribution; authorship verification; classification
DOI
Not available
CLC classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Stylometry is a form of authorship attribution that relies on the linguistic information found in a document. While there has been significant work in stylometry, most research focuses on the closed-world problem, in which the author of the document is in a known suspect set. For open-world problems, in which the author may not be in the suspect set, traditional classification methods are ineffective. This paper proposes the "classify-verify" method, which augments classification with a binary verification step, and evaluates it on stylometric datasets. This method, which can be generalized to any domain, significantly outperforms traditional classifiers in open-world settings and yields an F1-score of 0.87, comparable to traditional classifiers in closed-world settings. Moreover, the method successfully detects adversarial documents in which authors deliberately change their styles, a problem on which closed-world classifiers fail.
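The classify-then-verify idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the nearest-centroid classifier, the Euclidean distance threshold used for verification, the two-dimensional feature vectors, and all names (`classify`, `verify`, `classify_verify`, `training`) are assumptions chosen only to show how a verification step lets a classifier answer "unknown author" instead of always picking someone from the suspect set.

```python
import math

def centroid(vectors):
    """Mean feature vector of one author's training documents."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(doc, centroids):
    """Closed-world step: always returns the nearest known author."""
    return min(centroids, key=lambda author: distance(doc, centroids[author]))

def verify(doc, author, centroids, threshold):
    """Binary verification (assumed rule): accept only if the document is close enough."""
    return distance(doc, centroids[author]) <= threshold

def classify_verify(doc, centroids, threshold):
    """Open-world decision: the predicted author, or None if verification rejects."""
    author = classify(doc, centroids)
    return author if verify(doc, author, centroids, threshold) else None

# Hypothetical toy features (e.g. two normalized style frequencies per document).
training = {
    "alice": [[0.9, 0.1], [0.8, 0.2]],
    "bob": [[0.1, 0.9], [0.2, 0.8]],
}
centroids = {author: centroid(docs) for author, docs in training.items()}

in_set = classify_verify([0.82, 0.18], centroids, threshold=0.2)
out_of_set = classify_verify([0.5, 3.0], centroids, threshold=0.2)
```

In this sketch the plain classifier would still assign the second document to the nearest suspect, while the verification step rejects that assignment. The threshold here is arbitrary; in practice it would have to be tuned on held-out data.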
Pages: 185-205
Page count: 21