Stylometric Fake News Detection Based on Natural Language Processing Using Named Entity Recognition: In-Domain and Cross-Domain Analysis

Cited by: 3

Authors:

Tsai, Chih-Ming [1]

Affiliations:

[1] Natl Chin Yi Univ Technol, Dept Ind Engn & Management, 57 Sec 2, Zhongshan Rd, Taichung 411030, Taiwan

Source:

ELECTRONICS | 2023 / Vol. 12 / Issue 17

Keywords:

fake news; stylometric detection; natural language processing (NLP); named entity recognition (NER)

DOI:

10.3390/electronics12173676

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Nowadays, the dissemination of news information has become more rapid, liberal, and open to the public. People can find what they want to know more and more easily from a variety of sources, including traditional news outlets and new social media platforms. However, at a time when our lives are glutted with all kinds of news, we cannot help but doubt the veracity and legitimacy of these news sources; meanwhile, we also need to guard against the possible impact of various forms of fake news. To combat the spread of misinformation, more and more researchers have turned to natural language processing (NLP) approaches for effective fake news detection. However, in the face of increasingly serious fake news events, existing detection methods still need continuous improvement. This study proposes a modified proof-of-concept model named NER-SA, which integrates NLP and named entity recognition (NER) to conduct in-domain and cross-domain analysis of fake news detection on three existing datasets simultaneously. The named entities associated with any particular news event exist in a finite and available evidence pool. Therefore, the entities mentioned in any authentic news article must be recognizable in this entity bank. A piece of fake news inevitably includes only some of the entities in the entity bank. The false information is deliberately fabricated with fictitious, imaginary, and even unreasonable sentences and content. As a result, there must be differences in statements, writing logic, and style between legitimate news and fake news, meaning that it is possible to successfully detect fake news. We developed a mathematical model and used the simulated annealing algorithm to find the optimal legitimate area. Comparing the detection performance of the NER-SA model with current state-of-the-art models proposed in other studies, we found that the NER-SA model indeed has superior performance in detecting fake news. For in-domain analysis, the accuracy increased by an average of 8.94% on the LIAR dataset and 19.36% on the fake or real news dataset, while the F1-score increased by an average of 24.04% on the LIAR dataset and 19.36% on the fake or real news dataset. In cross-domain analysis, the accuracy and F1-score for the NER-SA model increased by an average of 28.51% and 24.54%, respectively, across six domains in the FakeNews AMT dataset. The findings and implications of this study are further discussed with regard to their significance for improving accuracy, understanding context, and addressing adversarial attacks. The development of stylometric detection based on NLP approaches using NER techniques can improve the effectiveness and applicability of fake news detection.
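
The abstract mentions using simulated annealing to find an "optimal legitimate area" but does not give the mathematical model. As a rough illustration only, the sketch below assumes each article has been reduced to a single hypothetical entity-based score and that the legitimate area is an interval [lo, hi] on that score; the function names, the scoring, and the cooling schedule are assumptions made for this example and are not taken from the paper.

```python
import math
import random


def accuracy(interval, scores, labels):
    """Share of articles classified correctly when a score inside
    [lo, hi] is called legitimate (label 1) and any other score fake (label 0)."""
    lo, hi = interval
    correct = sum((lo <= s <= hi) == bool(y) for s, y in zip(scores, labels))
    return correct / len(scores)


def anneal_interval(scores, labels, steps=5000, t0=1.0, cooling=0.999, seed=0):
    """Generic simulated annealing over interval bounds (hypothetical setup).

    Starts from the interval covering every score, perturbs both bounds with
    Gaussian noise, always accepts improvements, and accepts worse candidates
    with probability exp(delta / temperature).
    """
    rng = random.Random(seed)
    span = max(scores) - min(scores)
    current = (min(scores), max(scores))
    current_acc = accuracy(current, scores, labels)
    best, best_acc = current, current_acc
    temp = t0
    for _ in range(steps):
        lo = current[0] + rng.gauss(0, 0.1 * span)
        hi = current[1] + rng.gauss(0, 0.1 * span)
        if lo > hi:
            lo, hi = hi, lo  # keep the bounds ordered
        candidate = (lo, hi)
        candidate_acc = accuracy(candidate, scores, labels)
        delta = candidate_acc - current_acc
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_acc = candidate, candidate_acc
            if current_acc > best_acc:
                best, best_acc = current, current_acc
        temp *= cooling  # geometric cooling schedule
    return best, best_acc


if __name__ == "__main__":
    # Toy data: legitimate articles (label 1) cluster in the middle of the range.
    toy_scores = [0.10, 0.20, 0.45, 0.50, 0.55, 0.80, 0.90]
    toy_labels = [0, 0, 1, 1, 1, 0, 0]
    interval, acc = anneal_interval(toy_scores, toy_labels)
    print(interval, acc)
```

The search keeps the best interval seen so far, so the returned accuracy is never lower than that of the starting interval. A real model would presumably search over a richer feature space than this one-dimensional toy.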

Pages: 16
