Cross-Language Authorship Attribution

Cited by: 0
Authors
Bogdanova, Dasha [1]
Lazaridou, Angeliki [2]
Affiliations
[1] Dublin City Univ, Sch Comp, CNGL Ctr Global Intelligent Content, Dublin 9, Ireland
[2] Univ Trent, Ctr Mind Brain Sci, I-38100 Trento, Italy
Source
LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | 2014
Keywords
Cross-Language Techniques; Authorship Attribution; Text Classification
DOI
Not available
Chinese Library Classification number
H0 [Linguistics]
Discipline classification codes
030303; 0501; 050102
Abstract
This paper presents the novel task of cross-language authorship attribution (CLAA), an extension of the authorship attribution task to multilingual settings: given data labelled with authors in language X, the objective is to determine the author of a document written in language Y, where X ≠ Y. We propose a number of cross-language stylometric features for CLAA, such as those based on sentiment and emotional markers. We also explore an approach based on machine translation (MT) with both lexical and cross-language features. We show experimentally that MT can serve as a starting point for CLAA, since it allows good attribution accuracy to be achieved. The cross-language features provide acceptable accuracy when used jointly with MT, though they do not outperform lexical features.
Pages: 2015-2020
Number of pages: 6