WELFake: Word Embedding Over Linguistic Features for Fake News Detection

Cited by: 104
Authors
Verma, Pawan Kumar [1,2]
Agrawal, Prateek [2,3]
Amorim, Ivone [4,5]
Prodan, Radu [3]
Affiliations
[1] GLA Univ, Dept Comp Engn & Applicat, Mathura 281406, India
[2] Lovely Profess Univ, Sch Comp Sci & Engn, Phagwara 144411, India
[3] Univ Klagenfurt, Inst Informat Technol, A-9020 Klagenfurt, Austria
[4] MOG Technol, P-4470605 Moreira, Portugal
[5] Univ Porto, CMUP Math Res Ctr, P-4099002 Porto, Portugal
Funding
European Union Horizon 2020
Keywords
Social networking (online); Linguistics; Data models; Bit error rate; Feature extraction; Training; Vegetation; Bidirectional encoder representations from transformer (BERT); convolutional neural network (CNN); fake news; linguistic feature; machine learning (ML); text classification; voting classifier; word embedding (WE); DECEPTION; CUES;
DOI
10.1109/TCSS.2021.3068519
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Social media is a popular medium for disseminating real-time news all over the world. Easy and quick information proliferation is one of the reasons for its popularity. A large number of users of different age groups, genders, and societal beliefs engage with social media websites. Despite these favorable aspects, a significant disadvantage comes in the form of fake news, as people usually read and share information without checking whether it is genuine. Therefore, it is imperative to research methods for authenticating news. To address this issue, this article proposes a two-phase benchmark model named WELFake, based on word embedding (WE) over linguistic features, for fake news detection using machine learning classification. The first phase preprocesses the data set and validates the veracity of news content by using linguistic features. The second phase merges the linguistic feature sets with WE and applies voting classification. To validate its approach, this article also carefully designs a novel WELFake data set of approximately 72,000 articles, which incorporates different data sets to generate an unbiased classification output. Experimental results show that the WELFake model classifies news as real or fake with 96.73% accuracy, which improves the overall accuracy by 1.31% compared to bidirectional encoder representations from transformer (BERT) and by 4.25% compared to convolutional neural network (CNN) models. Our frequency-based model, which focuses on analyzing writing patterns, outperforms predictive-based related works implemented using the Word2vec WE method by up to 1.73%.
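The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The four linguistic features, the count-based term vector standing in for a frequency-based WE, and the function names `linguistic_features`, `count_vector`, `merge_features`, and `hard_vote` are all assumptions chosen for illustration; the abstract does not list the actual feature set or classifiers.

```python
import re
from collections import Counter


def linguistic_features(text):
    """Toy writing-pattern features (hypothetical, not the paper's feature set)."""
    words = text.split()
    n_words = len(words)
    return [
        n_words,                                              # word count
        sum(len(w) for w in words) / n_words if n_words else 0.0,  # mean word length
        text.count("!"),                                      # exclamation marks
        sum(1 for w in words if w.isupper() and len(w) > 1),  # all-caps words
    ]


def count_vector(text, vocabulary):
    """Frequency-based term vector over a fixed vocabulary (assumed stand-in for WE)."""
    counts = Counter(re.findall(r"[a-z0-9']+", text.lower()))
    return [counts[term] for term in vocabulary]


def merge_features(text, vocabulary):
    """Second phase, step 1: concatenate linguistic features with the term vector."""
    return linguistic_features(text) + count_vector(text, vocabulary)


def hard_vote(predictions):
    """Second phase, step 2: majority vote over the labels from several classifiers."""
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    vocab = ["news", "claim", "fake"]  # hypothetical vocabulary
    print(merge_features("BREAKING news SHOCKING claim!", vocab))
    print(hard_vote(["fake", "real", "fake"]))
```

In practice, each merged feature vector would be passed to several trained classifiers, and `hard_vote` would combine their predicted labels into the final real or fake decision.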
Pages: 881-893
Page count: 13
Related papers
50 items in total
  • [41] Albanian Fake News Detection
    Canhasi, Ercan
    Shijaku, Rexhep
    Berisha, Erblin
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2022, 21 (05)
  • [42] A Tool for Fake News Detection
    Al Asaad, Bashar
    Erascu, Madalina
    2018 20TH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING (SYNASC 2018), 2019, : 379 - 386
  • [43] Fake news detection on Twitter
    Sharma, Srishti
    Saraswat, Mala
    Dubey, Anil Kumar
    INTERNATIONAL JOURNAL OF WEB INFORMATION SYSTEMS, 2022, 18 (5/6) : 388 - 412
  • [44] Fake News Pattern Recognition using Linguistic Analysis
    Dey, Amitabha
    Rafi, Rafsan Zani
    Parash, Shahriar Hasan
    Arko, Sauvik Kundu
    Chakrabarty, Amitabha
    2018 JOINT 7TH INTERNATIONAL CONFERENCE ON INFORMATICS, ELECTRONICS & VISION (ICIEV) AND 2018 2ND INTERNATIONAL CONFERENCE ON IMAGING, VISION & PATTERN RECOGNITION (ICIVPR), 2018, : 305 - 309
  • [45] Fake news: linguistic aspects of a new textual genre
    Scarpanti, Edoardo
    FORUM ITALICUM, 2021, 55 (03) : 865 - 877
  • [46] Feature analysis of fake news: improving fake news detection in social media
    Leung, Johnathan
    Vatsalan, Dinusha
    Arachchilage, Nalin
    Journal of Cyber Security Technology, 2023, 7 (04) : 224 - 241
  • [47] Escaping the neutralization effect of modality features fusion in multimodal Fake News Detection
    Wang, Bing
    Li, Ximing
    Li, Changchun
    Wang, Shengsheng
    Gao, Wanfu
    INFORMATION FUSION, 2024, 111
  • [48] FN2: Fake News DetectioN Based on Textual and Contextual Features
    Rabhi, Mouna
    Bakiras, Spiridon
    Di Pietro, Roberto
    INFORMATION AND COMMUNICATIONS SECURITY, ICICS 2022, 2022, 13407 : 472 - 491
  • [49] A comprehensive survey of fake news in social networks: Attributes, features, and detection approaches
    Kondamudi, Medeswara Rao
    Sahoo, Somya Ranjan
    Chouhan, Lokesh
    Yadav, Nandakishor
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2023, 35 (06)
  • [50] Enhancing hierarchical attention networks with CNN and stylistic features for fake news detection
    Alghamdi, Jawaher
    Lin, Yuqing
    Luo, Suhuai
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 257