"Liar, Liar Pants on Fire": A New Benchmark Dataset for Fake News Detection

Cited by: 490
Author
Wang, William Yang [1]
Affiliation
[1] Univ Calif Santa Barbara, Dept Comp Sci, Santa Barbara, CA 93106 USA
DOI
10.18653/v1/P17-2067
Chinese Library Classification
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Automatic fake news detection is a challenging problem in deception detection, and it has tremendous real-world political and social impacts. However, statistical approaches to combating fake news have been dramatically limited by the lack of labeled benchmark datasets. In this paper, we present LIAR: a new, publicly available dataset for fake news detection. We collected 12.8K manually labeled short statements, spanning a decade and made in various contexts, from POLITIFACT.COM, which provides a detailed analysis report and links to source documents for each case. This dataset can be used for fact-checking research as well. Notably, this new dataset is an order of magnitude larger than the previously largest public fake news datasets of similar type. Empirically, we investigate automatic fake news detection based on surface-level linguistic patterns. We have designed a novel, hybrid convolutional neural network to integrate meta-data with text. We show that this hybrid approach can improve on a text-only deep learning model.
Pages: 422-426
Number of pages: 5