LIMESODA: Dataset for Fake News Detection in Healthcare Domain

Cited by: 1
Authors
Payoungkhamdee, Patomporn [1]
Porkaew, Peerachet [2]
Sinthunyathum, Atthasith [1]
Songphum, Phattharaphon [1]
Kawidam, Witsarut [1]
Loha-Udom, Wichayut [1]
Boonkwan, Prachya [2]
Sutantayawalee, Vipas [1]
Affiliations
[1] Backyard Co Ltd, Bangkok, Thailand
[2] NECTEC, Bangkok, Thailand
DOI
10.1109/iSAI-NLP54397.2021.9678187
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present LIMESODA, our Thai fake news dataset in the healthcare domain, together with its construction guideline. Each document in the dataset is classified as fact, fake, or undefined. We also provide token-level annotations for validating classifier decisions, using five high-level annotation tags: 1) misleading headline, 2) imposter, 3) fabrication, 4) false connection, and 5) misleading content. We curated and manually annotated 7,191 documents with these tags. We evaluate our dataset with two deep learning approaches, RNN and Transformer baselines, and analyze token-level contributions to understand model behavior. For the RNN model, we use the attention weights as token-level contributions. For Transformer models, we use the integrated gradients method at the embedding layers. Finally, we compare these token-level contributions with human annotations. Although our baseline models yield promising performance, we found that the tokens that support model decisions are quite different from the human annotations.
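The token-level analysis described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy classifier (a sigmoid over the mean token embedding), the two-dimensional embeddings, the weight vector, the all-zero baseline, and the top-k overlap measure are all assumptions chosen only to show how integrated gradients at the embedding layer yield one score per token, and how those scores might be compared with human-marked tokens.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def model(embeddings, w):
    """Toy classifier (assumption): sigmoid of w . mean(token embeddings)."""
    n = len(embeddings)
    z = sum(sum(wj * ej for wj, ej in zip(w, e)) for e in embeddings) / n
    return sigmoid(z)

def grad_wrt_embeddings(embeddings, w):
    """Analytic gradient of the toy model with respect to each token embedding."""
    p = model(embeddings, w)
    scale = p * (1.0 - p) / len(embeddings)
    return [[scale * wj for wj in w] for _ in embeddings]

def integrated_gradients(embeddings, w, steps=200):
    """Midpoint Riemann approximation of integrated gradients, zero baseline."""
    dim = len(w)
    avg_grad = [[0.0] * dim for _ in embeddings]
    for s in range(steps):
        alpha = (s + 0.5) / steps
        # Point on the straight path from the zero baseline to the input.
        scaled = [[alpha * x for x in e] for e in embeddings]
        g = grad_wrt_embeddings(scaled, w)
        for i in range(len(embeddings)):
            for j in range(dim):
                avg_grad[i][j] += g[i][j] / steps
    # Per-token score: sum over dimensions of (input - baseline) * average gradient.
    return [sum(e[j] * avg_grad[i][j] for j in range(dim))
            for i, e in enumerate(embeddings)]

def top_k_overlap(scores, human_idx, k):
    """Fraction of the k highest-|score| tokens that a human annotator also marked."""
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i]), reverse=True)
    return len(set(ranked[:k]) & set(human_idx)) / k

# Hypothetical example: five tokens with made-up 2-d embeddings.
tokens = ["drinking", "lime", "soda", "cures", "cancer"]
embeddings = [[0.1, 0.0], [0.2, 0.1], [0.1, 0.2], [1.5, -0.5], [1.0, -1.0]]
w = [1.0, -0.5]

scores = integrated_gradients(embeddings, w)
for tok, sc in zip(tokens, scores):
    print(f"{tok:10s} {sc:+.4f}")

# Compare model attributions with hypothetical human-marked token indices.
print(top_k_overlap(scores, human_idx=[0, 4], k=2))
```

A useful sanity check on any integrated gradients implementation is completeness: the per-token scores should sum to approximately the model output on the input minus the output on the baseline. For a real Transformer the gradient would come from automatic differentiation rather than a closed form, and for the RNN baseline the attention weights would take the place of `scores`.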
Pages: 6