Dataset for Arabic Fake News

Cited by: 5
Authors
Assaf, Rasha [1]
Saheb, Mahmoud [1]
Affiliations
[1] Palestine Polytech Univ, Hebron, Palestine
Keywords
Fabricated contents; Annotations; Cohen's Kappa
DOI
10.1109/AICT52784.2021.9620228
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The adoption of social media platforms allows misinformation to spread quickly, which can mislead the public. The ease of sharing on the internet enables users to create and distribute massive amounts of information, some of which is unreliable. Fake news has therefore become an important social issue for researchers to tackle. A few English fake news datasets have been published, and numerous machine learning approaches have been proposed for news reliability classification. However, reliable Arabic datasets for fake news detection remain scarce. This paper is a data paper in which we present a new dataset of Arabic fake news. The data was collected from various sources, including PalKashif. The articles and news segments were labeled by two experts. The dataset contains about 500 news segments, and the inter-annotator agreement, measured using Cohen's Kappa, is 0.807. The dataset will be published for public use on GitHub.
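The agreement statistic named in the abstract can be illustrated with a minimal sketch. Cohen's Kappa compares the observed agreement between two annotators with the agreement expected by chance from each annotator's label frequencies. The function name `cohens_kappa` and the ten toy labels below are assumptions made for illustration only; they are not the paper's data, code, or label scheme.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's Kappa for two annotators labeling the same items.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty item set.")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each category, the product of the two
    # annotators' marginal proportions, summed over all categories.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy labels (assumed, NOT from the dataset): the annotators disagree on one item.
annotator_1 = ["fake", "fake", "real", "real", "fake",
               "real", "real", "fake", "real", "real"]
annotator_2 = ["fake", "fake", "real", "real", "fake",
               "real", "fake", "fake", "real", "real"]

kappa = cohens_kappa(annotator_1, annotator_2)
print(f"{kappa:.3f}")  # prints 0.800
```

In this toy case the observed agreement is 0.9 and the chance agreement is 0.5, so Kappa is (0.9 - 0.5) / (1 - 0.5) = 0.8, close to the 0.807 reported for the real dataset.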
Pages: 4