Exploiting Textual Information for Fake News Detection

Cited by: 2
Authors
Kasseropoulos, Dimitrios Panagiotis [1]
Koukaras, Paraskevas [1]
Tjortjis, Christos [1]
Affiliations
[1] Hellen Univ, Sch Sci & Technol Int, Data Min & Analyt Res Grp, 14th Km Thessaloniki N Moudania, Thessaloniki 57001, Greece
Keywords
Fake news; Machine Learning (ML); Artificial Neural Networks (ANN); Natural Language Processing (NLP); Association Rules Mining (ARM); SOCIAL MEDIA; CLASSIFICATION; SELECTION;
DOI
10.1142/S0129065722500587
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405;
Abstract
"Fake news" refers to the deliberate dissemination of news with the intent to deceive and mislead the public. This paper assesses the accuracy of several Machine Learning (ML) algorithms, using a style-based technique that relies on textual information extracted from news, such as part-of-speech counts. To extend the style-based techniques already proposed, a new method of enhancing a linguistic feature set is introduced. It combines Named Entity Recognition (NER) with the Frequent Pattern (FP) Growth association rule mining algorithm, aiming to provide better insight into the articles' sentence-level structure. Recursive feature elimination was used to identify a subset of the highest-performing linguistic characteristics, which turned out to align with the literature. Using pre-trained word embeddings, both document embeddings and weighted document embeddings were constructed, the latter using each word's TF-IDF value as the weight factor. The document embeddings were combined with the linguistic features, providing a variety of training/test feature sets. For each model, the best-performing feature set was identified and its hyperparameters were fine-tuned to improve accuracy. The ML algorithms' results were compared with two Neural Networks: a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network. The results indicate that the CNN outperformed all other methods in terms of accuracy when accompanied by pre-trained word embeddings, yet the Support Vector Machine (SVM) performs almost as well with a wider variety of input feature sets. Although the style-based technique scores lower accuracy, it provides explainable results about the author's writing style decisions. Our work points out how new technologies and combinations of existing techniques can enhance the style-based approach, capturing more information.
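The abstract does not spell out how the weighted document embeddings were computed, so the following is only a minimal sketch of one common reading: a document vector is the average of its word vectors, and the weighted variant scales each word vector by that word's TF-IDF value before normalizing. The function names, the toy two-dimensional vectors, and the smoothed IDF formula (the one scikit-learn uses by default) are all assumptions made for illustration, not details taken from the paper.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {token: tf-idf} dict per tokenized document.

    Assumption: tf is the relative term frequency and idf is the smoothed
    form ln((1 + N) / (1 + df)) + 1; the paper may use a different variant.
    """
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            tok: (cnt / len(doc))
            * (math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0)
            for tok, cnt in counts.items()
        })
    return weights


def document_embedding(doc, vectors, weights=None):
    """Combine word vectors into one document vector.

    With weights=None each distinct token is weighted by its count, which
    gives the plain average over all tokens. Otherwise each distinct token
    is weighted by its TF-IDF value. Out-of-vocabulary tokens are skipped.
    """
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    norm = 0.0
    for tok, cnt in Counter(doc).items():
        if tok not in vectors:
            continue
        w = float(cnt) if weights is None else weights.get(tok, 0.0)
        total = [t + w * v for t, v in zip(total, vectors[tok])]
        norm += w
    return [t / norm for t in total] if norm > 0 else total


# Hypothetical stand-in for pre-trained word embeddings.
vectors = {"fake": [1.0, 0.0], "news": [0.0, 1.0], "report": [1.0, 1.0]}
docs = [["fake", "news", "news"], ["news", "report"]]

weights = tfidf_weights(docs)
plain = document_embedding(docs[0], vectors)
weighted = document_embedding(docs[0], vectors, weights[0])
```

In this toy corpus "news" occurs in both documents while "fake" occurs in only one, so the weighted vector for the first document leans further toward the "fake" direction than the plain average does.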
Pages: 18
Related papers
50 results in total
  • [1] Multimodal Fake News Detection with Textual, Visual and Semantic Information
    Giachanou, Anastasia
    Zhang, Guobiao
    Rosso, Paolo
    TEXT, SPEECH, AND DIALOGUE (TSD 2020), 2020, 12284 : 30 - 38
  • [2] Exploiting Multi-domain Visual Information for Fake News Detection
    Qi, Peng
    Cao, Juan
    Yang, Tianyun
    Guo, Junbo
    Li, Jintao
    2019 19TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2019), 2019, : 518 - 527
  • [3] Fake News Detection Utilizing Textual Cues
    Chouliara, Vasiliki
    Koukaras, Paraskevas
    Tjortjis, Christos
    ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS, AIAI 2023, PT I, 2023, 675 : 393 - 403
  • [4] Arabic Fake News Detection Based on Textual Analysis
    Hanen Himdi
    George Weir
    Fatmah Assiri
    Hassanin Al-Barhamtoshy
    Arabian Journal for Science and Engineering, 2022, 47 : 10453 - 10469
  • [5] Arabic Fake News Detection Based on Textual Analysis
    Himdi, Hanen
    Weir, George
    Assiri, Fatmah
    Al-Barhamtoshy, Hassanin
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2022, 47 (08) : 10453 - 10469
  • [6] Exploiting Content Characteristics for Explainable Detection of Fake News
    Muñoz, Sergio
    Iglesias, Carlos Á.
    Big Data and Cognitive Computing, 2024, 8 (10)
  • [7] Deep Learning for Fake News Detection in a Pairwise Textual Input Schema
    Mouratidis, Despoina
    Nikiforos, Maria Nefeli
    Kermanidis, Katia Lida
    COMPUTATION, 2021, 9 (02) : 1 - 15
  • [8] Bias Detection and Mitigation in Textual Data: A Study on Fake News and Hate Speech Detection
    Kasampalis, Apostolos
    Chatzakou, Despoina
    Tsikrika, Theodora
    Vrochidis, Stefanos
    Kompatsiaris, Ioannis
    ADVANCES IN INFORMATION RETRIEVAL, ECIR 2024, PT III, 2024, 14610 : 374 - 383
  • [9] Exploiting stance similarity and graph neural networks for fake news detection
    Soga, Kayato
    Yoshida, Soh
    Muneyasu, Mitsuji
    PATTERN RECOGNITION LETTERS, 2024, 177 : 26 - 32
  • [10] A deep learning approach for detecting fake reviewers: Exploiting reviewing behavior and textual information
    Zhang, Dong
    Li, Wenwen
    Niu, Baozhuang
    Wu, Chong
    DECISION SUPPORT SYSTEMS, 2023, 166