Utilizing Latent Posting Style for Authorship Attribution on Short Texts

Cited by: 2
Authors
Leepaisomboon, Patamawadee [1]
Iwaihara, Mizuho [1]
Affiliations
[1] Waseda Univ, Grad Sch Informat Prod & Syst, Fukuoka, Japan
Keywords
Latent Dirichlet allocation; authorship attribution; sentiment; short text; Twitter; support vector machine; social network
DOI
10.1109/DASC/PiCom/CBDCom/CyberSciTech.2019.00184
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Character n-grams and word n-grams are the most widely used features for authorship attribution on short texts. In this paper, we propose a new method that exploits latent posting styles estimated from authors' short texts. The new posting style features characterize each user's posting style through sentiment orientation and post length. Concise hidden posting styles are captured by Latent Dirichlet Allocation (LDA), for which we consider two types of LDA models. The vectors of latent posting styles are then concatenated with averaged word embeddings of character n-grams and word n-grams and used to train a support vector machine. Our results show that combining latent posting styles with the traditional features can improve the accuracy of authorship attribution by up to 5.2%.
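The abstract does not say how sentiment orientation and post length are encoded, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It maps each post to a discrete "style token" (a sentiment label combined with a length bucket); an author's list of such tokens could then serve as a pseudo-document for a topic model such as LDA. The toy lexicon, the bucket thresholds, and all function names are hypothetical, and the LDA, embedding, and SVM stages are omitted.

```python
from collections import Counter

# Hypothetical toy lexicon (assumption): the abstract does not name
# the sentiment resource that the paper actually uses.
POSITIVE = {"good", "great", "love", "happy"}
NEGATIVE = {"bad", "hate", "sad", "awful"}


def sentiment_orientation(post: str) -> str:
    """Label a post 'pos', 'neg', or 'neu' by counting lexicon hits."""
    words = post.lower().split()  # whitespace split; punctuation is not stripped
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return "neu"


def length_bucket(post: str, short_max: int = 5, medium_max: int = 12) -> str:
    """Bucket a post by word count; the thresholds are illustrative assumptions."""
    n = len(post.split())
    if n <= short_max:
        return "short"
    if n <= medium_max:
        return "medium"
    return "long"


def style_token(post: str) -> str:
    """Combine sentiment and length into one discrete posting-style token."""
    return f"{sentiment_orientation(post)}_{length_bucket(post)}"


def style_histogram(posts: list[str]) -> dict[str, float]:
    """Relative frequency of each style token over one author's posts."""
    counts = Counter(style_token(p) for p in posts)
    total = sum(counts.values())
    return {token: count / total for token, count in counts.items()}


if __name__ == "__main__":
    posts = ["I love this great day", "bad", "see you at noon"]
    print([style_token(p) for p in posts])
    print(style_histogram(posts))
```

In a fuller pipeline, the per-author token lists would be passed to an LDA implementation to obtain latent style vectors, which the abstract says are concatenated with n-gram embeddings before SVM training.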
Pages: 1015-1022
Number of pages: 8