Automatic detection of fake tweets about the COVID-19 Vaccine in Portuguese

Cited by: 1
Authors
Geurgas, Rafael [1]
Tessler, Leandro R. [1]
Affiliations
[1] Univ Estadual Campinas, IFGW, BR-13083970 Campinas, SP, Brazil
Funding
São Paulo Research Foundation (FAPESP), Brazil
Keywords
Disinformation; COVID-19; Neural networks; Automatic classification;
DOI
10.1007/s13278-024-01216-x
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The COVID-19 pandemic induced an unprecedented wave of disinformation on social media in Brazil. In particular, Twitter (currently X) was used to spread fake news about COVID-19 vaccines, which contributed to vaccine hesitancy. This article presents a BERT-based neural network for the automatic detection of fake tweets. The optimized architecture relies on BERTimbau, a BERT implementation pre-trained on Brazilian Portuguese, fine-tuned using three fully connected layers. All 2,857,908 tweets in Portuguese containing the word "vacina" (Portuguese for "vaccine") were collected over 7 months. A random subset of 16,731 tweets was manually classified as real or fake. Of these, 2,309 were discarded for not being about COVID-19 vaccines and 422 were discarded for containing irony. Of the remaining 14,000 tweets, 1,144 were labeled fake and 12,856 real. To reduce the class imbalance in the training dataset, the network was fine-tuned using the 1,144 curated fake tweets and a random sample of 2,000 real tweets. Optimal results were achieved by "melting" (unfreezing) the last four layers of BERTimbau. The best results obtained were a 77.1% F1-score and 76.9% accuracy. These results are already acceptable for practical applications, and they can be improved by increasing the size of the training dataset. A weighted F1-score of 96.3% was obtained by training the same neural network architecture, with the same hyperparameters, on a larger curated and balanced English-language training dataset.
Pages: 10
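The dataset figures reported in the abstract can be cross-checked with a short Python sketch. It is only an illustration: the helper `undersample`, the seed, and the placeholder integer IDs are assumptions made here, and the abstract does not say how the authors drew their random sample of real tweets.

```python
import random

# Counts taken from the abstract.
LABELED = 16_731    # manually classified tweets
OFF_TOPIC = 2_309   # discarded: not about COVID-19 vaccines
IRONIC = 422        # discarded: contained irony
FAKE = 1_144        # labeled fake after filtering
REAL = 12_856       # labeled real after filtering
REAL_SAMPLE = 2_000 # real tweets kept for fine-tuning


def undersample(items, n, seed=0):
    """Draw n items without replacement (hypothetical helper, fixed seed)."""
    rng = random.Random(seed)
    return rng.sample(items, n)


# Tweets left after both filtering steps.
remaining = LABELED - OFF_TOPIC - IRONIC

# Placeholder IDs stand in for the real tweets, which are not reproduced here.
real_ids = list(range(REAL))
train_real = undersample(real_ids, REAL_SAMPLE)

# Share of fake tweets before and after undersampling the real class.
fake_share_before = FAKE / remaining
fake_share_after = FAKE / (FAKE + len(train_real))

print(f"remaining tweets: {remaining}")
print(f"fake share before: {fake_share_before:.1%}")
print(f"fake share after:  {fake_share_after:.1%}")
```

Under these counts, undersampling raises the share of fake tweets from about 8.2% of the filtered set to about 36.4% of the fine-tuning set, so the classes remain unequal but far less skewed.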