Big Data and quality data for fake news and misinformation detection

Cited by: 63
Authors
Asr, Fatemeh Torabi [1]
Taboada, Maite [1]
Affiliation
[1] Simon Fraser Univ, Burnaby, BC, Canada
Source
BIG DATA & SOCIETY | 2019 / Vol. 6 / Issue 01
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Fake news; misinformation; labelled datasets; text classification; machine learning; topic modelling; INFORMATION;
DOI
10.1177/2053951719843310
Chinese Library Classification number
C [Social Sciences, General];
Discipline classification code
03 ; 0303 ;
Abstract
Fake news has become an important topic of research in a variety of disciplines including linguistics and computer science. In this paper, we explain how the problem is approached from the perspective of natural language processing, with the goal of building a system to automatically detect misinformation in news. The main challenge in this line of research is collecting quality data, i.e., instances of fake and real news articles on a balanced distribution of topics. We review available datasets and introduce the MisInfoText repository as a contribution of our lab to the community. We make available the full text of the news articles, together with veracity labels previously assigned based on manual assessment of the articles' truth content. We also perform a topic modelling experiment to elaborate on the gaps and sources of imbalance in currently available datasets to guide future efforts. We appeal to the community to collect more data and to make it available for research purposes.
Pages: 14
Related papers
50 in total
  • [41] CITIZENS AS TARGETS: ELECTORAL CAMPAIGNS AND POLITICAL DISPUTES STRATEGIES IN THE ERA OF BIG DATA AND FAKE NEWS
    Barbosa, Laise Milena
    Pereira dos Santos, Julia Raquel do Lago
    dos Anjos, Alexsandro
    da Cruz, Fabricio Bittencourt
    [J]. HUMANIDADES & INOVACAO, 2021, 8 (48) : 66 - 81
  • [42] FAKE NEWS AND ALIENS: MISINFORMATION IN RESPONSE TO SOCIAL STIGMA
    Barrutia Navarrete, Mercedes J.
    [J]. REVISTA INCLUSIONES, 2020, 7 : 286 - 305
  • [43] Beyond Fake News: Finding the Truth in a World of Misinformation
    Schroeder, Sarah Bartlett
    [J]. LIBRARY JOURNAL, 2022, 147 (06) : 52 - 52
  • [44] Fake news on drugs: post-truth and misinformation
    Pasquim, Heitor
    Oliveira, Marcos
    Soares, Cassia Baldini
    [J]. SAUDE E SOCIEDADE, 2020, 29 (02):
  • [46] Profiling Fake News: Learning the Semantics and Characterisation of Misinformation
    Agarwal, Swati
    Samavedhi, Adithya
    [J]. ADVANCED DATA MINING AND APPLICATIONS, ADMA 2021, PT I, 2022, 13087 : 203 - 216
  • [47] Fake news, disinformation and misinformation in social media: a review
    Aimeur, Esma
    Amri, Sabrine
    Brassard, Gilles
    [J]. SOCIAL NETWORK ANALYSIS AND MINING, 2023, 13 (01)
  • [48] Fake news, disinformation and misinformation in social media: a review
    Esma Aïmeur
    Sabrine Amri
    Gilles Brassard
    [J]. Social Network Analysis and Mining, 13
  • [49] Containing misinformation: Modeling spatial games of fake news
    Jones, Matthew I.
    Pauls, Scott D.
    Fu, Feng
    [J]. PNAS NEXUS, 2024, 3 (03):