Authorship attribution in the wild

Authors
Moshe Koppel
Jonathan Schler
Shlomo Argamon
Affiliations
[1] Bar-Ilan University
[2] Illinois Institute of Technology
Keywords
Authorship attribution; Open candidate set; Randomized feature set;
DOI
Not available
Abstract
Most previous work on authorship attribution has focused on the case in which we need to attribute an anonymous document to one of a small set of candidate authors. In this paper, we consider authorship attribution as found in the wild: the set of known candidates is extremely large (possibly many thousands) and might not even include the actual author. Moreover, the known texts and the anonymous texts might be of limited length. We show that even in these difficult cases, we can use similarity-based methods along with multiple randomized feature sets to achieve high precision. Moreover, we show the precise relationship between attribution precision and four parameters: the size of the candidate set, the quantity of known text by the candidates, the length of the anonymous text, and a certain robustness score associated with an attribution.
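The abstract names the ingredients (similarity-based matching, multiple randomized feature sets, a robustness score, and the option of not attributing at all) but gives no implementation detail. The following is a minimal sketch of how those pieces could fit together, not the authors' actual procedure. Everything concrete in it is an assumption made for illustration: character 4-grams as features, cosine similarity, 100 iterations over a random 40% of the features, a score defined as the fraction of iterations won by the top candidate, and the hypothetical names `char_ngrams`, `cosine`, and `attribute`.

```python
import math
import random
from collections import Counter


def char_ngrams(text, n=4):
    """Count overlapping character n-grams in a text (assumed feature type)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b, features):
    """Cosine similarity of two count vectors restricted to `features`."""
    dot = sum(a[f] * b[f] for f in features)
    norm_a = math.sqrt(sum(a[f] ** 2 for f in features))
    norm_b = math.sqrt(sum(b[f] ** 2 for f in features))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def attribute(anonymous, known, iterations=100, fraction=0.4,
              threshold=0.9, seed=0):
    """Return (author, score); author is None when score < threshold.

    `known` maps each candidate author to their known text. The score is
    the share of random feature subsets on which the winning candidate
    was the most similar one (an assumed stand-in for a robustness score).
    """
    rng = random.Random(seed)
    anon_vec = char_ngrams(anonymous)
    known_vecs = {author: char_ngrams(text) for author, text in known.items()}
    vocabulary = sorted(set(anon_vec).union(*known_vecs.values()))
    if not vocabulary:
        return None, 0.0
    sample_size = max(1, int(fraction * len(vocabulary)))

    wins = Counter()
    for _ in range(iterations):
        # Each iteration compares texts on a different random feature subset.
        features = rng.sample(vocabulary, sample_size)
        best = max(
            known_vecs,
            key=lambda author: cosine(anon_vec, known_vecs[author], features),
        )
        wins[best] += 1

    author, count = wins.most_common(1)[0]
    score = count / iterations
    # Abstain when the top candidate does not win consistently enough,
    # since the true author may be missing from the candidate set.
    return (author if score >= threshold else None), score
```

Calling `attribute(text, {"author_a": sample_a, "author_b": sample_b})` returns the best-matching candidate together with its score, or `None` in place of the author when the score falls below `threshold`. Raising the threshold trades recall for precision, which is the kind of relationship the abstract says the paper quantifies.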
Pages: 83-94
Number of pages: 11
Citation
Koppel, Moshe; Schler, Jonathan; Argamon, Shlomo. Authorship attribution in the wild [J]. Language Resources and Evaluation, 2011, 45(1): 83-94