Automatic detection of fake tweets about the COVID-19 Vaccine in Portuguese

Cited by: 1
Authors
Geurgas, Rafael [1]
Tessler, Leandro R. [1]
Affiliations
[1] Univ Estadual Campinas, IFGW, BR-13083970 Campinas, SP, Brazil
Funding
São Paulo Research Foundation (FAPESP), Brazil
Keywords
Disinformation; COVID-19; Neural networks; Automatic classification
DOI
10.1007/s13278-024-01216-x
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The COVID-19 pandemic triggered an unprecedented wave of disinformation on social media in Brazil. In particular, Twitter (currently X) was used to spread fake news about COVID-19 vaccines, which contributed to vaccine hesitancy. This article presents a BERT-based neural network for the automatic detection of fake tweets. The optimized architecture relies on BERTimbau, a BERT implementation pre-trained on Brazilian Portuguese, fine-tuned using three fully connected layers. All 2,857,908 tweets in Portuguese containing the word vacina (vaccine in Portuguese) were collected over 7 months. A random subset of 16,731 tweets was manually classified as real or fake. Of these, 2,309 were discarded for not being about COVID-19 vaccines and 422 were discarded for containing irony. Of the remaining 14,000 tweets, 1,144 were labeled fake and 12,856 real. To balance the training dataset, the network was fine-tuned using the 1,144 curated fake tweets and a random sample of 2,000 real tweets. Optimal results were achieved by unfreezing the last four layers of BERTimbau. The best results obtained were a 77.1% F1-score and 76.9% accuracy. These results are already acceptable for practical applications and can be improved by increasing the size of the training dataset. A weighted F1-score of 96.3% was obtained by training the same neural network architecture, with the same hyperparameters, on a larger curated, balanced English-language training dataset.
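The dataset bookkeeping and the two reported metrics can be checked with a short sketch. The Python below is illustrative only and is not the authors' code: all function names are hypothetical, the real tweets are stood in by integer IDs, and F1 is computed for the fake (positive) class as an assumption, since the abstract does not state which averaging was used for the 77.1% figure.

```python
import random

# Counts taken from the abstract.
LABELED = 16_731      # manually classified tweets
OFF_TOPIC = 2_309     # discarded: not about COVID-19 vaccines
IRONIC = 422          # discarded: contained irony
FAKE = 1_144          # kept tweets labeled fake
REAL = 12_856         # kept tweets labeled real
REAL_SAMPLE = 2_000   # real tweets drawn for fine-tuning


def remaining_after_discards(labeled, *discarded):
    """Number of tweets left once every discarded group is removed."""
    return labeled - sum(discarded)


def undersample(items, k, seed=0):
    """Draw k items without replacement (hypothetical helper for the
    'random sample of 2,000 real tweets' step)."""
    rng = random.Random(seed)
    return rng.sample(items, k)


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Accuracy and F1 for the positive class (assumed here: fake = 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    accuracy = correct / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


if __name__ == "__main__":
    kept = remaining_after_discards(LABELED, OFF_TOPIC, IRONIC)
    real_ids = list(range(REAL))  # placeholder IDs, not actual tweets
    sampled_real = undersample(real_ids, REAL_SAMPLE)
    train_size = FAKE + len(sampled_real)
    print("kept after discards:", kept)
    print("fake share before sampling:", round(FAKE / kept, 3))
    print("fake share in training set:", round(FAKE / train_size, 3))
```

Undersampling the majority class raises the share of fake tweets in the training set without touching the curated fake examples, which is the balancing step the abstract describes.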
Pages: 10