An empiric validation of linguistic features in machine learning models for fake news detection

Cited by: 2
Authors
Puraivan, Eduardo [1 ,2 ]
Venegas, Rene [3 ]
Riquelme, Fabian [2 ]
Affiliations
[1] Univ Vina del Mar, Escuela Ciencias, Vina del Mar, Chile
[2] Univ Valparaiso, Escuela Ingn Informat, Valparaiso, Chile
[3] Pontificia Univ Catolica Valparaiso, Inst Literatura & Ciencias Lenguaje, Valparaiso, Chile
Keywords
Fake news; Mass media; Natural language processing; Linguistic features; Machine learning;
DOI
10.1016/j.datak.2023.102207
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The diffusion of fake news is a growing problem with a high and negative social impact. There are several approaches to address the detection of fake news. This work focuses on a hybrid approach based on functional linguistic features and machine learning. There are several recent works with this approach. However, there are no clear guidelines on which linguistic features are most appropriate or on how to justify their use. Furthermore, many classification results are modest compared to recent advances in natural language processing. Our proposal considers 88 features organized into surface information, part of speech, discursive characteristics, and readability indices. On a database of 42 677 news items, we show that the classification results outperform previous work, even outperforming state-of-the-art techniques such as BERT, reaching 99.99% accuracy. A proper selection of linguistic features is crucial for interpretability as well as for the performance of the models. In this sense, our proposal contributes to the intentional selection of linguistic features, overcoming current technical issues. We identified 32 features that show differences between the types of news. The results are highly competitive in classification and simple to implement and interpret.
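To make the feature families named in the abstract concrete, the following is a minimal sketch of how a few surface-information features and one readability index could be computed for a single news text. It is not the authors' implementation: the function name `surface_features`, the chosen features, and the regex tokenization are assumptions made for illustration, and the abstract does not list the actual 88 features. The readability index shown is the standard Automated Readability Index (ARI).

```python
import re


def surface_features(text: str) -> dict:
    """Compute a handful of illustrative surface and readability features.

    Hypothetical helper for illustration only; the paper's real feature
    set (88 features) is not specified in the abstract.
    """
    # Naive sentence split on terminal punctuation (an assumption; a real
    # pipeline would likely use a proper NLP sentence tokenizer).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Naive word tokenization: runs of letters, digits, and apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text)

    n_chars = sum(len(w) for w in words)
    n_words = max(len(words), 1)      # guard against division by zero
    n_sents = max(len(sentences), 1)

    # Automated Readability Index:
    # 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43
    ari = 4.71 * (n_chars / n_words) + 0.5 * (n_words / n_sents) - 21.43

    return {
        "n_words": len(words),
        "n_sentences": len(sentences),
        "avg_word_len": n_chars / n_words,
        "type_token_ratio": len({w.lower() for w in words}) / n_words,
        "exclamations": text.count("!"),
        "ari": ari,
    }


if __name__ == "__main__":
    print(surface_features("Shocking news! You will not believe this."))
```

Feature dictionaries of this kind, one per news item, could then be stacked into a matrix and passed to any standard classifier; which classifier the authors used is not stated in the abstract.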
Pages: 16