PRINCIPAL COMPONENT ANALYSIS FOR AUTHORSHIP ATTRIBUTION

Citations: 0
Authors
Jamak, Amir [1]
Savatic, Alen [1]
Can, Mehmet [1]
Affiliations
[1] International University of Sarajevo, Faculty of Engineering and Natural Sciences, Hrasnicka Cesta 15, Sarajevo 71000, Bosnia and Herzegovina
Keywords
principal components; authorship attribution; stylometry; text categorization; function words; classification task; stylistic features; syntactic characteristics;
DOI
Not available
Chinese Library Classification
C93 [Management Science]; O22 [Operations Research];
Discipline classification codes
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
A common problem in statistical pattern recognition is that of feature selection or feature extraction. Feature selection refers to a process whereby a data space is transformed into a feature space that, in theory, has exactly the same dimension as the original data space. However, the transformation is designed in such a way that the data set may be represented by a reduced number of "effective" features and yet retain most of the intrinsic information content of the data; in other words, the data set undergoes a dimensionality reduction. In this paper, the data collected by counting words and characters in around a thousand paragraphs of each sample book underwent a principal component analysis performed using neural networks. The first principal component is then used to distinguish the books written by a particular author.
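The abstract does not say which neural network was used for the principal component analysis. As an illustration only, the sketch below uses Oja's Hebbian learning rule, a single linear neuron whose weight vector tends toward the first principal direction of centered data. The function names (`center`, `first_component_oja`, `project`), the learning rate, and the toy rows are all hypothetical; they stand in for per-paragraph feature vectors and are not the paper's data or method.

```python
import math
import random


def center(rows):
    """Subtract the per-feature mean from every row."""
    n = len(rows)
    dim = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    return [[r[j] - means[j] for j in range(dim)] for r in rows]


def first_component_oja(rows, lr=0.01, epochs=200, seed=0):
    """Estimate the first principal direction with Oja's rule (assumed network)."""
    rng = random.Random(seed)
    data = center(rows)
    dim = len(data[0])
    # Start from a random unit-length weight vector.
    w = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    for _ in range(epochs):
        for x in data:
            y = sum(wi * xi for wi, xi in zip(w, x))  # output of the linear neuron
            # Oja update: Hebbian term y*x minus the decay term y^2*w.
            w = [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]
    # Normalize once more so the result is exactly unit length.
    norm = math.sqrt(sum(v * v for v in w))
    return [v / norm for v in w]


def project(rows, w):
    """Score each row by its projection onto the direction w."""
    return [sum(wi * xi for wi, xi in zip(w, x)) for x in center(rows)]


# Hypothetical toy feature vectors (two features per paragraph).
rows = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9], [5.0, 5.1], [6.0, 5.8]]
w = first_component_oja(rows)
scores = project(rows, w)
print(w)
print(scores)
```

Each entry of `scores` is one paragraph's coordinate along the first principal component. In a setting like the one the abstract describes, such one-dimensional scores could then be compared across books. Real word and character counts would probably need rescaling first, since a fixed learning rate can make the Oja update unstable on large raw counts.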
Pages: 189-196
Page count: 8