A debiased self-training framework with graph self-supervised pre-training aided for semi-supervised rumor detection

Cited by: 0
Authors
Qiao, Yuhan [1 ]
Cui, Chaoqun [1 ]
Wang, Yiying [1 ]
Jia, Caiyan [1 ]
Institution
[1] Beijing Jiaotong Univ, Sch Comp & Informat Technol, Beijing Key Lab Traff Data Anal & Min, Beijing 100044, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Rumor detection; Self-training; Semi-supervised learning; Self-supervised learning; Confirmation bias; Graph representation; PROPAGATION; NETWORK;
DOI
10.1016/j.neucom.2024.128314
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Existing rumor detection models have achieved remarkable performance in fully-supervised settings. However, obtaining extensive labeled rumor data is time-consuming and labor-intensive. To mitigate the reliance on labeled data, semi-supervised learning (SSL), which jointly learns from labeled and unlabeled samples, achieves significant performance improvements at low cost. Commonly used self-training methods in SSL, despite their simplicity and efficiency, suffer from notorious confirmation bias, which can be seen as the accumulation of noise arising from the utilization of incorrect pseudo-labels. To address this problem, in this study we propose a debiased self-training framework with graph self-supervised pre-training for semi-supervised rumor detection. First, to strengthen the initial model for self-training and reduce the generation of incorrect pseudo-labels in the early stages, we leverage the rumor propagation structures of massive unlabeled data for graph self-supervised pre-training. Second, we improve the quality of pseudo-labels with a pseudo-labeling strategy based on self-adaptive thresholds, which combines self-paced global thresholds controlling the overall utilization of pseudo-labels with local class-specific thresholds attending to the learning status of each class. Extensive experiments on four public benchmarks demonstrate that our method significantly outperforms previous rumor detection baselines in semi-supervised settings, especially when labeled samples are extremely scarce. Notably, we achieve 96.3% accuracy on Weibo with 500 labels per class and 86.0% accuracy with just 5 labels per class.
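The self-adaptive thresholding described in the abstract can be made concrete with a short sketch. The Python snippet below is a minimal illustration, not the authors' implementation: it assumes a FreeMatch-style scheme in which a self-paced global threshold and local class-specific thresholds are maintained as exponential moving averages of the model's confidence on unlabeled data; all names (AdaptiveThresholder, momentum, tau_g, p_local) are hypothetical.

    import numpy as np

    class AdaptiveThresholder:
        """Illustrative pseudo-label selection with a self-paced global
        threshold and local class-specific thresholds (EMA estimates)."""

        def __init__(self, num_classes, momentum=0.999):
            self.m = momentum
            # Global threshold, initialized to uniform confidence so that
            # most pseudo-labels are admitted at the very start of training.
            self.tau_g = 1.0 / num_classes
            # Per-class expected probability, tracking the learning status
            # of each class.
            self.p_local = np.full(num_classes, 1.0 / num_classes)

        def update(self, probs):
            """probs: (batch, num_classes) softmax outputs on unlabeled data."""
            self.tau_g = self.m * self.tau_g + (1 - self.m) * probs.max(axis=1).mean()
            self.p_local = self.m * self.p_local + (1 - self.m) * probs.mean(axis=0)

        def select(self, probs):
            """Return pseudo-labels and a mask of samples confident enough to use."""
            pseudo = probs.argmax(axis=1)
            conf = probs.max(axis=1)
            # Scale the global threshold by each class's normalized learning
            # status: well-learned classes keep a strict threshold, while
            # under-learned classes are admitted more leniently.
            tau_c = self.tau_g * (self.p_local / self.p_local.max())
            mask = conf >= tau_c[pseudo]
            return pseudo, mask

    # Example: filter one batch of unlabeled predictions for 3 classes.
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(8, 3))
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    thresholder = AdaptiveThresholder(num_classes=3)
    thresholder.update(probs)
    labels, mask = thresholder.select(probs)

Under such a scheme, only the samples passing their class-specific threshold contribute to the self-training loss, which is one way to limit the accumulation of incorrect pseudo-labels behind confirmation bias.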
Pages: 14