Stylometric Fake News Detection Based on Natural Language Processing Using Named Entity Recognition: In-Domain and Cross-Domain Analysis

Cited by: 3
Authors
Tsai, Chih-Ming [1]
Affiliation
[1] Natl Chin Yi Univ Technol, Dept Ind Engn & Management, 57 Sec 2, Zhongshan Rd, Taichung 411030, Taiwan
Keywords
fake news; stylometric detection; natural language processing (NLP); named entity recognition (NER)
DOI
10.3390/electronics12173676
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
The dissemination of news has become more rapid, liberal, and open to the public. People can find what they want to know with increasing ease from a variety of sources, including traditional news outlets and new social media platforms. However, at a time when our lives are glutted with all kinds of news, we cannot help but doubt the veracity and legitimacy of these news sources; meanwhile, we also need to guard against the possible impact of various forms of fake news. To combat the spread of misinformation, more and more researchers have turned to natural language processing (NLP) approaches for effective fake news detection. However, in the face of increasingly serious fake news events, existing detection methods still need continuous improvement. This study proposes a modified proof-of-concept model named NER-SA, which integrates NLP and named entity recognition (NER) to conduct in-domain and cross-domain analysis of fake news detection on three existing datasets. The named entities associated with any particular news event exist in a finite and available evidence pool. Therefore, the entities mentioned in any authentic news article must be recognizable in this entity bank. A piece of fake news inevitably includes only some of the entities in the entity bank. The false information is deliberately fabricated with fictitious, imaginary, and even unreasonable sentences and content. As a result, there must be differences in statements, writing logic, and style between legitimate news and fake news, which makes it possible to detect fake news. We developed a mathematical model and used the simulated annealing algorithm to find the optimal legitimate area. Comparing the detection performance of the NER-SA model with current state-of-the-art models proposed in other studies, we found that the NER-SA model achieves superior performance in detecting fake news.
For in-domain analysis, the accuracy increased by an average of 8.94% on the LIAR dataset and 19.36% on the fake or real news dataset, while the F1-score increased by an average of 24.04% on the LIAR dataset and 19.36% on the fake or real news dataset. In cross-domain analysis, the accuracy and F1-score for the NER-SA model increased by an average of 28.51% and 24.54%, respectively, across six domains in the FakeNews AMT dataset. The findings and implications of this study are further discussed with regard to their significance for improving accuracy, understanding context, and addressing adversarial attacks. The development of stylometric detection based on NLP approaches using NER techniques can improve the effectiveness and applicability of fake news detection.
Pages: 16