Improving Semi-Supervised Text Classification with Dual Meta-Learning

Cited: 1
Authors
Li, Shujie [1 ]
Yuan, Guanghu [1 ]
Yang, Min [1 ]
Shen, Ying [2 ]
Li, Chengming [2 ]
Xu, Ruifeng [3 ]
Zhao, Xiaoyan [1 ]
Affiliations
[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, 1068 Xueyuan Ave,Univ Town,Xili, Shenzhen 518055, Guangdong, Peoples R China
[2] Sun Yat Sen Univ, Sch Intelligent Syst Engn, 66 Gongchang Rd, Guangzhou, Guangdong, Peoples R China
[3] Harbin Inst Technol Shenzhen, Shenzhen 518055, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Semi-supervised text classification; pseudo labeling; noise transition matrix; meta learning; consistency regularization;
DOI
10.1145/3648612
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
The goal of semi-supervised text classification (SSTC) is to train a model by exploiting both a small amount of labeled data and a large amount of unlabeled data, such that the learned semi-supervised classifier performs better than a supervised classifier trained solely on the labeled samples. Pseudo-labeling is one of the most widely used SSTC techniques: a teacher classifier is trained on the small set of labeled examples and used to predict pseudo labels for the unlabeled data. The generated pseudo-labeled examples are then used to train a student classifier, with the aim that the learned student classifier outperforms the teacher classifier. Nevertheless, the predicted pseudo labels may be inaccurate, degrading the performance of the student classifier; the student may even perform worse than the teacher. To alleviate this issue, in this paper we introduce a dual meta-learning (DML) technique for semi-supervised text classification, which improves the teacher and student classifiers simultaneously in an iterative manner. Specifically, we propose a meta-noise correction method that improves the student classifier by learning a Noise Transition Matrix (NTM) with meta-learning to rectify the noisy pseudo labels. In addition, we devise a meta pseudo supervision method to improve the teacher classifier: the feedback performance from the student classifier guides the teacher to produce more accurate pseudo labels for the unlabeled data. In this way, both the teacher and student classifiers co-evolve during the iterative training process. Extensive experiments on four benchmark datasets highlight the effectiveness of our DML method against existing state-of-the-art methods for semi-supervised text classification. We publicly release the code and data of this paper at https://github.com/GRIT621/DML.
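As a concrete illustration of the training loop sketched in the abstract, the snippet below shows how the two components, meta-noise correction via a learned Noise Transition Matrix and meta pseudo supervision via student feedback, might be wired together in one training step. It is a minimal PyTorch sketch under simplifying assumptions (a toy classifier over pre-computed features, a single differentiable inner step for the NTM meta-update, and a first-order scalar feedback for the teacher), not the authors' released implementation, which is available at the repository linked above.

# Minimal sketch of a dual meta-learning step (illustrative assumptions, PyTorch 2.x).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call

NUM_CLASSES, FEAT_DIM, INNER_LR = 4, 128, 0.1

class Classifier(nn.Module):
    """Stand-in text classifier over pre-computed sentence features."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(FEAT_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, NUM_CLASSES))
    def forward(self, x):
        return self.net(x)

teacher, student = Classifier(), Classifier()
# Noise Transition Matrix (NTM): after a row-wise softmax it maps the student's
# "clean" class posterior to the posterior over (possibly noisy) pseudo labels.
ntm_logits = nn.Parameter(3.0 * torch.eye(NUM_CLASSES))

opt_t = torch.optim.Adam(teacher.parameters(), lr=1e-3)
opt_s = torch.optim.Adam(student.parameters(), lr=1e-3)
opt_m = torch.optim.Adam([ntm_logits], lr=1e-2)

def ntm_corrected_loss(logits, pseudo_labels):
    """Cross-entropy against pseudo labels after pushing the clean posterior
    through the NTM (meta-noise correction)."""
    ntm = F.softmax(ntm_logits, dim=-1)              # (C, C), rows sum to 1
    noisy = F.softmax(logits, dim=-1) @ ntm          # (B, C)
    return F.nll_loss(torch.log(noisy + 1e-8), pseudo_labels)

def dml_step(x_lab, y_lab, x_unlab):
    # Teacher produces hard pseudo labels for the unlabeled batch.
    pseudo = teacher(x_unlab).argmax(dim=-1).detach()

    # Meta-noise correction: update the NTM so that a student trained on the
    # corrected pseudo labels does well on the small labeled set. A single
    # differentiable "virtual" SGD step stands in for the inner loop.
    params = dict(student.named_parameters())
    inner_loss = ntm_corrected_loss(functional_call(student, params, (x_unlab,)), pseudo)
    grads = torch.autograd.grad(inner_loss, list(params.values()), create_graph=True)
    virtual = {k: p - INNER_LR * g for (k, p), g in zip(params.items(), grads)}
    meta_loss = F.cross_entropy(functional_call(student, virtual, (x_lab,)), y_lab)
    opt_m.zero_grad()
    meta_loss.backward()
    opt_m.step()

    # Student update on the NTM-corrected pseudo-label loss.
    lab_loss_before = F.cross_entropy(student(x_lab), y_lab).item()
    s_loss = ntm_corrected_loss(student(x_unlab), pseudo)
    opt_s.zero_grad()
    s_loss.backward()
    opt_s.step()

    # Meta pseudo supervision: reward the teacher when its pseudo labels made
    # the student improve on labeled data (first-order scalar feedback instead
    # of differentiating through the student update).
    lab_loss_after = F.cross_entropy(student(x_lab), y_lab).item()
    feedback = lab_loss_before - lab_loss_after      # > 0 if the student improved
    t_loss = (feedback * F.cross_entropy(teacher(x_unlab), pseudo)
              + F.cross_entropy(teacher(x_lab), y_lab))
    opt_t.zero_grad()
    t_loss.backward()
    opt_t.step()

# Toy usage with random features standing in for encoded text.
dml_step(torch.randn(8, FEAT_DIM), torch.randint(0, NUM_CLASSES, (8,)),
         torch.randn(32, FEAT_DIM))

The sketch only shows how the NTM meta-update and the student-to-teacher feedback can coexist in a single iteration; the full method described in the paper also employs consistency regularization and a more elaborate bi-level optimization, for which the released code should be consulted.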
Pages: 28
Related Papers
50 records in total
  • [21] GraphixMatch: Improving semi-supervised learning for graph classification with FixMatch
    Koh, Eunji
    Lee, Young Jae
    Kim, Seoung Bum
    NEUROCOMPUTING, 2024, 607
  • [22] Improving automatic query classification via semi-supervised learning
    Beitzel, SM
    Jensen, EC
    Frieder, O
    Lewis, DD
    Chowdhury, A
    Kolcz, A
    Fifth IEEE International Conference on Data Mining, Proceedings, 2005, : 42 - 49
  • [23] Towards Low-Resource Semi-Supervised Dialogue Generation with Meta-Learning
    Huang, Yi
    Feng, Junlan
    Ma, Shuo
    Du, Xiaoyu
    Wu, Xiaoting
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 4123 - 4128
  • [24] Short text classification algorithm based on semi-supervised learning and SVM
    Yin, Chunyong
    Xiang, Jun
    Zhang, Hui
    Yin, Zhichao
    Wang, Jin
International Journal of Multimedia and Ubiquitous Engineering, 2015, 10 (12): 195 - 206
  • [25] Text Classification Using Semi-Supervised Clustering
    Zhang, Wen
    Yoshida, Taketoshi
    Tang, Xijin
    2009 INTERNATIONAL CONFERENCE ON BUSINESS INTELLIGENCE AND FINANCIAL ENGINEERING, PROCEEDINGS, 2009, : 197 - 200
  • [26] Variational Autoencoder for Semi-Supervised Text Classification
    Xu, Weidi
    Sun, Haoze
    Deng, Chao
    Tan, Ying
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 3358 - 3364
  • [27] Variational Pretraining for Semi-supervised Text Classification
    Gururangan, Suchin
    Dang, Tam
    Card, Dallas
    Smith, Noah A.
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 5880 - 5894
  • [28] Meta-learning of Text Classification Tasks
    Madrid, Jorge G.
    Jair Escalante, Hugo
    PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS (CIARP 2019), 2019, 11896 : 107 - 119
  • [29] Semi-Supervised Learning for ECG Classification
    Rodrigues, Rui
    Couto, Paula
2021 COMPUTING IN CARDIOLOGY (CINC), 2021
  • [30] Augmentation Learning for Semi-Supervised Classification
    Frommknecht, Tim
    Zipf, Pedro Alves
    Fan, Quanfu
    Shvetsova, Nina
    Kuehne, Hilde
    PATTERN RECOGNITION, DAGM GCPR 2022, 2022, 13485 : 85 - 98