A Comparative Survey of Authorship Attribution on Short Arabic Texts

Cited by: 3
Authors
Ouamour, Siham [1]
Sayoud, Halim [1]
Affiliation
[1] Univ Sci & Technol Houari Boumediene, Algiers, Algeria
Keywords
Natural language processing; Artificial intelligence; Authorship attribution; Arabic language; Short texts; Text-mining; Discrimination
DOI
10.1007/978-3-319-99579-3_50
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we address the problem of authorship attribution (AA) on short Arabic texts. We survey a set of features and classifiers employed for the AA task. The investigation uses characters, character bigrams, character trigrams, character tetragrams, words, word bigrams and rare words. AA is performed with 4 different measures, 3 classifiers (Multi-Layer Perceptron (MLP), Support Vector Machines (SVM) and Linear Regression (LR)) and a newly proposed fusion called VBF (Vote Based Fusion). The evaluation is carried out on short Arabic texts extracted from the AAAT dataset (AA of Ancient Arabic Texts). Although AA is known to be difficult on short texts, the results reveal interesting information about the performance of the features and classification techniques on Arabic text data. For instance, character-based features appear to perform better than word-based features on short texts. Furthermore, the proposed VBF fusion achieved an attribution accuracy of 90%, which is higher than the score of the original classifier using only one feature. Overall, the results of this investigation shed light on the efficiency and pertinence of several features and classifiers for AA of short Arabic texts.
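The abstract names character n-gram features and a vote-based fusion but does not give their details. As a rough illustration only, the Python sketch below shows one way character n-grams could be counted and how per-feature decisions could be combined by plain majority voting. The function names (`char_ngrams`, `majority_vote`), the tie-breaking rule, and the example labels are assumptions made for this sketch; they are not taken from the paper, and the actual VBF scheme may differ.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count the overlapping character n-grams of length n in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def majority_vote(predictions):
    """Return the most frequent label; ties go to the label listed first.

    This is simple majority voting, assumed here for illustration.
    It is not necessarily the VBF rule used by the authors.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical decisions made by one classifier on three feature types
# for the same test text (labels are invented for the example).
votes = {
    "char_bigrams": "author_A",
    "char_trigrams": "author_A",
    "words": "author_B",
}
fused = majority_vote(list(votes.values()))
```

In this sketch each feature type casts one vote for an author, so the fused decision can differ from any single-feature decision only when that feature is outvoted.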
Pages: 479-489
Number of pages: 11
Related papers
50 in total
  • [1] Authorship Attribution of Short Historical Arabic Texts Based on Lexical Features
    Ouamour, S.
    Sayoud, H.
    [J]. 2013 INTERNATIONAL CONFERENCE ON CYBER-ENABLED DISTRIBUTED COMPUTING AND KNOWLEDGE DISCOVERY (CYBERC), 2013, : 144 - 147
  • [2] Performance of authorship attribution classifiers with short texts: application of religious Arabic fatwas
    Al-Sarem, Mohammed
    Emara, Abdel-Hamid
    Wahab, Ahmed Abdel
    [J]. INTERNATIONAL JOURNAL OF DATA MINING MODELLING AND MANAGEMENT, 2020, 12 (03) : 350 - 364
  • [3] Naive Bayes classifiers for authorship attribution of Arabic texts
    Altheneyan, Alaa Saleh
    Menai, Mohamed El Bachir
    [J]. JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2014, 26 (04) : 473 - 484
  • [4] Authorship Attribution on Short Texts in the Slovenian Language
    Gabrovsek, Gregor
    Peer, Peter
    Emersic, Ziga
    Batagelj, Borut
    [J]. APPLIED SCIENCES-BASEL, 2023, 13 (19):
  • [5] Survey of Authorship Identification Tasks on Arabic Texts
    Alqahtani, Fatimah
    Dohler, Mischa
    [J]. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (04)
  • [6] A Transformer-Based Approach to Authorship Attribution in Classical Arabic Texts
    AlZahrani, Fetoun Mansour
    Al-Yahya, Maha
    [J]. APPLIED SCIENCES-BASEL, 2023, 13 (12):
  • [7] Towards Authorship Attribution in Arabic Short-Microblog Text
    Jambi, Kamal Mansour
    Khan, Imtiaz Hussain
    Siddiqui, Muazzam Ahmed
    Alhaj, Salma Omar
    [J]. IEEE ACCESS, 2021, 9 : 128506 - 128520
  • [8] Authorship Attribution of Arabic Articles
    Hajja, Maha
    Yahya, Ahmad
    Yahya, Adnan
    [J]. ARABIC LANGUAGE PROCESSING: FROM THEORY TO PRACTICE, ICALP 2019, 2019, 1108 : 194 - 208
  • [9] Utilizing Latent Posting Style for Authorship Attribution on Short Texts
    Leepaisomboon, Patamawadee
    Iwaihara, Mizuho
    [J]. IEEE 17TH INT CONF ON DEPENDABLE, AUTONOM AND SECURE COMP / IEEE 17TH INT CONF ON PERVAS INTELLIGENCE AND COMP / IEEE 5TH INT CONF ON CLOUD AND BIG DATA COMP / IEEE 4TH CYBER SCIENCE AND TECHNOLOGY CONGRESS (DASC/PICOM/CBDCOM/CYBERSCITECH), 2019, : 1015 - 1022
  • [10] Authorship Attribution of Arabic Tweets
    Rabab'ah, Abdullateef
    Al-Ayyoub, Mahmoud
    Jararweh, Yaser
    Aldwairi, Monther
    [J]. 2016 IEEE/ACS 13TH INTERNATIONAL CONFERENCE OF COMPUTER SYSTEMS AND APPLICATIONS (AICCSA), 2016,