Utilizing Latent Posting Style for Authorship Attribution on Short Texts

Cited by: 2
Authors
Leepaisomboon, Patamawadee [1]
Iwaihara, Mizuho [1]
Affiliations
[1] Waseda Univ, Grad Sch Informat Prod & Syst, Fukuoka, Japan
Keywords
Latent Dirichlet allocation; authorship attribution; sentiment; short text; Twitter; support vector machine; social network
DOI
10.1109/DASC/PiCom/CBDCom/CyberSciTech.2019.00184
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Character n-grams and word n-grams are the most widely used features for authorship attribution on short texts. In this paper, we propose a new method that exploits latent posting styles estimated from authors' short texts. The new posting style features characterize each user's posting style through sentiment orientation and post length. Concise hidden posting styles are captured by Latent Dirichlet Allocation (LDA), for which we consider two types of LDA models. The vectors of latent posting styles are then concatenated with averaged word embeddings of character n-grams and word n-grams and used to train a support vector machine. Our results show that combining latent posting styles with the traditional features can improve the accuracy of authorship attribution by up to 5.2%.
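The posting-style idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy sentiment lexicon, the length thresholds, and the names `sentiment_label`, `length_label`, `style_token`, and `style_counts` are all assumptions made for illustration. Each post is mapped to a discrete "style token" combining sentiment orientation and post length, and the tokens are counted per author.

```python
from collections import Counter

# Toy sentiment lexicon: an assumption for illustration, not from the paper.
POSITIVE = {"good", "great", "love", "happy"}
NEGATIVE = {"bad", "awful", "hate", "sad"}


def sentiment_label(post: str) -> str:
    """Return 'pos', 'neg', or 'neu' from a simple lexicon count."""
    words = [w.strip(".,!?").lower() for w in post.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return "neu"


def length_label(post: str, short_max: int = 5, medium_max: int = 15) -> str:
    """Bin a post by word count; the thresholds are assumed values."""
    n = len(post.split())
    if n <= short_max:
        return "short"
    if n <= medium_max:
        return "medium"
    return "long"


def style_token(post: str) -> str:
    """Combine sentiment orientation and length bin into one style token."""
    return f"{sentiment_label(post)}_{length_label(post)}"


def style_counts(posts_by_author: dict[str, list[str]]) -> dict[str, Counter]:
    """Count style tokens per author (one bag of style tokens per author)."""
    return {
        author: Counter(style_token(p) for p in posts)
        for author, posts in posts_by_author.items()
    }


# Hypothetical example data.
posts_by_author = {
    "alice": [
        "I love this!",
        "Great day, so happy with the results of the new model we trained",
    ],
    "bob": ["bad", "I hate waiting."],
}
counts = style_counts(posts_by_author)
```

Under this sketch, each author's bag of style tokens would play the role of a document for an LDA model, and the resulting topic proportions would be the latent posting-style vector that the abstract describes concatenating with n-gram features before training the support vector machine.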
Pages: 1015-1022
Number of pages: 8