LIMESODA: Dataset for Fake News Detection in Healthcare Domain

Cited by: 1
Authors
Payoungkhamdee, Patomporn [1]
Porkaew, Peerachet [2]
Sinthunyathum, Atthasith [1]
Songphum, Phattharaphon [1]
Kawidam, Witsarut [1]
Loha-Udom, Wichayut [1]
Boonkwan, Prachya [2]
Sutantayawalee, Vipas [1]
Affiliations
[1] Backyard Co Ltd, Bangkok, Thailand
[2] NECTEC, Bangkok, Thailand
DOI
10.1109/iSAI-NLP54397.2021.9678187
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present LIMESODA, our Thai fake news dataset in the healthcare domain, together with its construction guideline. Each document in the dataset is classified as fact, fake, or undefined. We also provide token-level annotations for validating classifier decisions. The five high-level annotation tags are 1) misleading headline, 2) imposter, 3) fabrication, 4) false connection, and 5) misleading content. We curated and manually annotated 7,191 documents with these tags. We evaluate our dataset with two deep learning approaches, RNN and Transformer baselines, and analyze token-level contributions to understand model behavior. For the RNN model, we use the attention weights as token-level contributions. For Transformer models, we use the integrated gradients method at the embedding layers. Finally, we compare these token-level contributions with human annotations. Although our baseline models yield promising performance, we found that the tokens that support model decisions are quite different from those marked in the human annotations.
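The idea of attributing a decision to tokens at the embedding layer and then checking it against human-marked tokens can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy mean-pooled logistic classifier stands in for a Transformer, the all-zero baseline is one common choice, and the names `model`, `integrated_gradients`, and `top_k_overlap` (a Jaccard overlap between the top-k attributed tokens and the annotated tokens) are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(embeddings, w):
    """Toy classifier (assumption): sigmoid of w dot the mean-pooled token embeddings."""
    n = len(embeddings)
    pooled = [sum(tok[d] for tok in embeddings) / n for d in range(len(w))]
    return sigmoid(sum(wd * pd for wd, pd in zip(w, pooled)))


def integrated_gradients(embeddings, w, steps=50):
    """Per-token attributions against an all-zero baseline.

    The path integral of the gradient is approximated by a midpoint Riemann sum.
    For this toy model the gradient w.r.t. each token embedding is
    p * (1 - p) * w / n, so it is the same for every token.
    """
    n = len(embeddings)
    avg_grad = [0.0] * len(w)
    for s in range(steps):
        alpha = (s + 0.5) / steps
        scaled = [[alpha * x for x in tok] for tok in embeddings]
        p = model(scaled, w)
        for d, wd in enumerate(w):
            avg_grad[d] += p * (1.0 - p) * wd / n / steps
    # Attribution = (input - baseline) * average gradient, summed over dimensions.
    return [sum(x * g for x, g in zip(tok, avg_grad)) for tok in embeddings]


def top_k_overlap(attributions, human_indices, k):
    """Jaccard overlap between the k highest-attributed tokens and human-marked tokens."""
    ranked = sorted(range(len(attributions)), key=lambda i: attributions[i], reverse=True)
    top = set(ranked[:k])
    human = set(human_indices)
    return len(top & human) / len(top | human)


# Three tokens with 2-dimensional embeddings (made-up numbers).
embeddings = [[0.2, -0.1], [1.5, 0.8], [-0.3, 0.4]]
w = [1.0, 0.5]
attributions = integrated_gradients(embeddings, w)
print(attributions)
print(top_k_overlap(attributions, human_indices=[1, 2], k=1))
```

A useful sanity check for any integrated gradients implementation is completeness: the attributions should sum to approximately the difference between the model output on the input and on the baseline. A low overlap score would correspond to the kind of disagreement between model-supporting tokens and human annotation that the abstract reports.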
Pages: 6