Author attribution of Turkish texts by feature mining

Authors
Tuerkoglu, Filiz [1]
Diri, Banu [1]
Amasyali, M. Fatih [1]
Affiliations
[1] Yildiz Tech Univ, Comp Engn, TR-34349 Istanbul, Turkey
Keywords
author attribution; n-grams; text classification; feature extraction; Turkish documents
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The aim of this study is to identify the author of a document of unknown authorship. Ten different feature vectors are obtained from authorship attributes, n-grams, and various combinations of these, all extracted from the documents whose authors are to be identified. The comparative performance of each feature vector is analyzed by applying the Naive Bayes, SVM, k-NN, RE, and MLP classification methods. The most successful classifiers are MLP and SVM. In the document classification process, n-grams are observed to give higher accuracy rates than authorship attributes. Nevertheless, using n-grams and authorship attributes together gives better results than using either alone.
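As a rough illustration of combining n-grams with authorship attributes in one feature vector, the sketch below counts character bigrams and appends two simple style measures. It is a minimal sketch under stated assumptions, not the authors' method: the abstract does not say whether the n-grams are character or word based, nor which authorship attributes were used, so the function names (`char_ngrams`, `style_attributes`, `combined_features`) and the two attributes chosen here are hypothetical.

```python
from collections import Counter


def char_ngrams(text, n=2):
    """Count overlapping character n-grams in the lowercased text.

    Note: str.lower() is locale-independent and does not apply the
    Turkish dotted/dotless I rules; real Turkish text may need custom
    case folding (assumption: ignored here for brevity).
    """
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def style_attributes(text):
    """Two illustrative style attributes (hypothetical, not the paper's list)."""
    words = text.split()
    # Treat '.', '!' and '?' as sentence terminators.
    normalized = text.replace("!", ".").replace("?", ".")
    sentences = [s for s in normalized.split(".") if s.strip()]
    return {
        "avg_word_len": sum(len(w) for w in words) / max(len(words), 1),
        "avg_sentence_len": len(words) / max(len(sentences), 1),
    }


def combined_features(text, n=2):
    """Merge n-gram counts and style attributes into one feature dict."""
    features = {f"ng:{gram}": count for gram, count in char_ngrams(text, n).items()}
    features.update(style_attributes(text))
    return features


if __name__ == "__main__":
    sample = "Bu bir test. Bu da test."
    for name, value in sorted(combined_features(sample).items()):
        print(name, value)
```

A feature dictionary like this could then be vectorized and passed to any of the classifiers named in the abstract; the choice of vectorizer and classifier settings is left open here because the record does not specify them.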
Pages: 1086 / +
Number of pages: 3