Applying authorship analysis to Arabic web content

Cited by: 0
Authors
Abbasi, A [1]
Chen, HC [1]
Affiliations
[1] Univ Arizona, Dept Management Informat Syst, Tucson, AZ 85721 USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The advent and rapid proliferation of internet communication has allowed the realization of numerous security issues. The anonymous nature of online mediums such as email, web sites, and forums provides an attractive communication method for criminal activity. Increased globalization and the boundless nature of the internet have further amplified these concerns due to the addition of a multilingual dimension. The world's social and political climate has caused Arabic to draw a great deal of attention. In this study we apply authorship identification techniques to Arabic web forum messages. Our research uses lexical, syntactic, structural, and content-specific writing style features for authorship identification. We address some of the problematic characteristics of Arabic en route to the development of an Arabic language model that provides a respectable level of classification accuracy for authorship discrimination. We also run experiments to evaluate the effectiveness of different feature types and classification techniques on our dataset.
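As a rough illustration of the four feature families named in the abstract (lexical, syntactic, structural, content-specific), the following minimal Python sketch computes a handful of writing-style features from a single message. It is not the authors' feature set: the abstract does not list the individual features or the normalisation steps, so the function names (`normalize`, `style_features`), the tiny word lists, and the choice to strip elongation marks (tatweel) and short-vowel diacritics before counting are all assumptions made for this example.

```python
import re
from collections import Counter

# Assumption: elongation (tatweel, U+0640) and short-vowel diacritics
# (U+064B..U+0652) are stripped before counting. The abstract does not say
# which normalisation the study applied.
TATWEEL = "\u0640"
DIACRITICS = re.compile("[\u064B-\u0652]")

# Latin punctuation plus Arabic question mark, comma, and semicolon.
PUNCTUATION = set(".,!?:;\u061F\u060C\u061B")

# Hypothetical, deliberately tiny word lists for illustration only.
FUNCTION_WORDS = {"في", "من", "على", "إلى", "عن"}
CONTENT_WORDS = {"سلام"}


def normalize(text: str) -> str:
    """Remove tatweel and diacritics so word forms compare equal."""
    return DIACRITICS.sub("", text.replace(TATWEEL, ""))


def style_features(message: str) -> dict:
    """Return a small dictionary of writing-style features for one message."""
    text = normalize(message)
    words = re.findall(r"\w+", text)  # \w is Unicode-aware in Python 3
    counts = Counter(words)
    n_words = len(words)
    return {
        # Lexical features
        "char_count": len(text),
        "word_count": n_words,
        "avg_word_length": sum(map(len, words)) / n_words if n_words else 0.0,
        "vocab_richness": len(counts) / n_words if n_words else 0.0,
        # Syntactic features
        "function_word_count": sum(counts[w] for w in FUNCTION_WORDS),
        "punctuation_count": sum(1 for c in text if c in PUNCTUATION),
        # Structural features
        "line_count": len(text.splitlines()),
        "elongation_count": message.count(TATWEEL),
        # Content-specific features
        "content_word_count": sum(counts[w] for w in CONTENT_WORDS),
    }
```

Feature dictionaries like this one would typically be turned into numeric vectors, one per message, and passed to a classifier trained to discriminate between candidate authors; which classifiers the study compared is not stated in the abstract.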
Pages: 183-197
Page count: 15
Related papers
50 in total
  • [1] Applying authorship analysis to extremist-group web forum messages
    Abbasi, A
    Chen, HC
    [J]. IEEE INTELLIGENT SYSTEMS, 2005, 20 (05) : 67 - 75
  • [2] Applying Web Service to Distributed Multimedia Content Analysis
    Zhang, Shilin
    Wang, Hui
    [J]. 2009 INTERNATIONAL CONFERENCE ON INFORMATION MANAGEMENT, INNOVATION MANAGEMENT AND INDUSTRIAL ENGINEERING, VOL 1, PROCEEDINGS, 2009, : 159 - +
  • [3] Content-based analysis to detect Arabic web spam
    Al-Kabi, Mohammed
    Wahsheh, Heider
    Alsmadi, Izzat
    Al-Shawakfa, Emad
    Wahbeh, Abdullah
    Al-Hmoud, Ahmed
    [J]. JOURNAL OF INFORMATION SCIENCE, 2012, 38 (03) : 284 - 296
  • [4] Conceptual Search for Arabic Web Content
    Al-Zoghby, Aya M.
    Shaalan, Khaled
    [J]. COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING (CICLING 2015), PT II, 2015, 9042 : 405 - 416
  • [5] Quality Assessment of Arabic Web Content: The case of the Arabic Wikipedia
    Yahya, Adnan
    Salhi, Ali
    [J]. 2014 10TH INTERNATIONAL CONFERENCE ON INNOVATIONS IN INFORMATION TECHNOLOGY (IIT), 2014, : 36 - 41
  • [6] Web design for dyslexics: Accessibility of Arabic content
    Al-Wabil, Areej
    Zaphiris, Panayiotis
    Wilson, Stephanie
    [J]. COMPUTERS HELPING PEOPLE WITH SPECIAL NEEDS, PROCEEDINGS, 2006, 4061 : 817 - 822
  • [7] Arabic Web-Based Information on Oral Lichen Planus: Content Analysis
    AlMeshrafi, Azzam
    AlHamad, Arwa F.
    AlKuraidees, Hamoud
    AlNasser, Lubna A.
    [J]. JMIR FORMATIVE RESEARCH, 2024, 8
  • [8] The microscope and the moving target: The challenge of applying content analysis to the World Wide Web
    McMillan, SJ
    [J]. JOURNALISM & MASS COMMUNICATION QUARTERLY, 2000, 77 (01) : 80 - 98
  • [9] On Authorship Authentication of Arabic Articles
    Alwajeeh, Ahmed
    Al-Ayyoub, Mahmoud
    Hmeidi, Ismail
    [J]. 2014 5TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION SYSTEMS (ICICS), 2014,
  • [10] Authorship Attribution of Arabic Articles
    Hajja, Maha
    Yahya, Ahmad
    Yahya, Adnan
    [J]. ARABIC LANGUAGE PROCESSING: FROM THEORY TO PRACTICE, ICALP 2019, 2019, 1108 : 194 - 208