Unifying Lexical, Syntactic, and Structural Representations of Written Language for Authorship Attribution

Authors
Jafariakinabad F. [1]
Hua K.A. [1]
Affiliations
[1] University of Central Florida, Orlando, FL
Keywords
Authorship attribution; Deep neural networks; Document analysis; Natural language processing; Syntax encoding
DOI
10.1007/s42979-021-00911-2
Abstract
Writing style in written language is a combination of consistent decisions associated with a specific author at different levels of language production, including lexical, syntactic, and structural. Recent work in neural-network-based style analysis largely lacks multi-level modeling of writing style. In this paper, we introduce a style-aware neural model to encode document information from three stylistic levels and evaluate it in the domain of authorship attribution. First, we propose a simple way to jointly encode syntactic and lexical representations of sentences. Subsequently, we employ an attention-based hierarchical neural network to encode the syntactic and semantic structure of sentences in documents while rewarding the sentences that contribute more to capturing the writing style. Our experimental results, based on four benchmark datasets, reveal the benefits of encoding document information from all three stylistic levels when compared to the baseline methods in the literature. Additionally, we adopt a transfer learning approach and use deep contextualized word representations (ELMo) in our model to measure the impact of lower-level versus higher-level linguistic representations of ELMo on the task of authorship attribution. According to our experimental results, lower-level linguistic representations, which mainly carry syntactic information, perform better in the authorship attribution task than higher-level linguistic representations, which mainly carry semantic information. © 2021, The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd.
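As a rough illustration of the two ideas named in the abstract (a joint lexical and syntactic encoding, followed by attention over sentences), the following is a minimal sketch and not the authors' architecture. It assumes that "jointly encode" can be approximated by concatenating a word embedding with a part-of-speech embedding, replaces the learned sentence encoder with mean pooling, and uses a fixed context vector where a trained model would learn one. All function names and toy values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def joint_word_vector(lexical_vec, syntactic_vec):
    """Assumed joint encoding: concatenate word and POS-tag embeddings."""
    return list(lexical_vec) + list(syntactic_vec)


def mean_pool(vectors):
    """Stand-in sentence encoder: average the word vectors of a sentence."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def attention_pool(vectors, context):
    """Weight each sentence by softmax(dot(sentence, context)) and sum."""
    scores = [sum(v * c for v, c in zip(vec, context)) for vec in vectors]
    weights = softmax(scores)
    pooled = [
        sum(w * vec[i] for w, vec in zip(weights, vectors))
        for i in range(len(context))
    ]
    return pooled, weights


# Toy document: two sentences, 2-d lexical and 2-d syntactic embeddings.
sentences = [
    [
        joint_word_vector([0.2, 0.1], [1.0, 0.0]),
        joint_word_vector([0.4, 0.3], [0.0, 1.0]),
    ],
    [joint_word_vector([0.9, 0.8], [1.0, 0.0])],
]
sentence_vectors = [mean_pool(sentence) for sentence in sentences]

# Hypothetical context vector; a trained model would learn this.
context = [1.0, 1.0, 0.0, 0.0]
doc_vector, weights = attention_pool(sentence_vectors, context)
```

In this sketch `weights` plays the role of the per-sentence attention scores, and `doc_vector` is the document representation that a classifier over candidate authors would consume.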
Related papers
50 in total
  • [41] Activation of lexical and syntactic target language properties in translation
    Ruiz, C.
    Paredes, N.
    Macizo, P.
    Bajo, M. T.
    ACTA PSYCHOLOGICA, 2008, 128 (03) : 490 - 500
  • [42] A Comparative Analysis of Word Embedding Representations in Authorship Attribution of Bengali Literature
    Chowdhury, Hemayet Ahmed
    Imon, Md Azizul Haque
    Islam, Md Saiful
    2018 21ST INTERNATIONAL CONFERENCE OF COMPUTER AND INFORMATION TECHNOLOGY (ICCIT), 2018,
  • [43] The lexical boost effect is not diagnostic of lexically-specific syntactic representations
    Scheepers, Christoph
    Raffray, Claudine N.
    Myachykov, Andriy
    JOURNAL OF MEMORY AND LANGUAGE, 2017, 95 : 102 - 115
  • [44] Asymmetric lexical access and fuzzy lexical representations in second language learners
    Darcy, Isabelle
    Daidone, Danielle
    Kojima, Chisato
    MENTAL LEXICON, 2013, 8 (03) : 372 - 420
  • [45] The relationship between thematic, lexical, and syntactic features of written texts and personality traits
    Jakovljev, Ivana
    Milin, Petar
    PSIHOLOGIJA, 2017, 50 (01) : 67 - 84
  • [46] Deep Stylometry and Lexical & Syntactic Features Based Author Attribution on PLoS Digital Repository
    Hassan, Saeed-Ul
    Imran, Mubashir
    Iftikhar, Tehreem
    Safder, Iqra
    Shabbir, Mudassir
    DIGITAL LIBRARIES: DATA, INFORMATION, AND KNOWLEDGE FOR DIGITAL LIVES, 2017, 10647
  • [47] Syntactic Encoding in Written Language Production by Deaf Writers: A Structural Priming Study and a Comparison With Hearing Writers
    Cai, Zhenguang G.
    Zhao, Nan
    Lin, Hao
    Xu, Zebo
    Thierfelder, Philip
    JOURNAL OF EXPERIMENTAL PSYCHOLOGY-LEARNING MEMORY AND COGNITION, 2023, 49 (06) : 974 - 989
  • [48] The timing of lexical and syntactic processes in second language sentence comprehension
    Hopp, Holger
    APPLIED PSYCHOLINGUISTICS, 2016, 37 (05) : 1253 - 1280
  • [49] Domain Adaptation for Authorship Attribution: Improved Structural Correspondence Learning
    Sapkota, Upendra
    Solorio, Thamar
    Montes-y-Gomez, Manuel
    Bethard, Steven
    PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016 : 2226 - 2235
  • [50] Syntactic Priming in Language Acquisition: Representations, mechanisms and applications
    Pontikas, George
    FIRST LANGUAGE, 2023, 43 (04) : 461 - 463