Adversarial training for few-shot text classification

Cited by: 2
Authors
Croce, Danilo [1 ]
Castellucci, Giuseppe [1 ]
Basili, Roberto [1 ]
Affiliations
[1] Univ Roma Tor Vergata, Dept Enterprise Engn, Rome, RM, Italy
Keywords
Semi-supervised learning; generative adversarial network; kernel-based embedding spaces; universal sentence encoding; Nystrom method
DOI
10.3233/IA-200051
CLC classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
In recent years, Deep Learning methods have become very popular in classification tasks for Natural Language Processing (NLP), mainly because they reach high performance while relying on very simple input representations, i.e., raw tokens. One drawback of deep architectures is the large amount of annotated data required for effective training. In Machine Learning this problem is usually mitigated by semi-supervised methods or, more recently, by Transfer Learning in the context of deep architectures. One recent promising method to enable semi-supervised learning in deep architectures has been formalized within Semi-Supervised Generative Adversarial Networks (SS-GANs) in the context of Computer Vision. In this paper, we adopt the SS-GAN framework to enable semi-supervised learning for NLP. We demonstrate how an SS-GAN can boost the performance of simple architectures when operating over expressive low-dimensional embeddings, derived by combining the unsupervised approximation of linguistic Reproducing Kernel Hilbert Spaces with so-called Universal Sentence Encoders. We experimentally evaluate the proposed approach on a semantic classification task, i.e., Question Classification, considering different sizes of training material and different numbers of target classes. By applying such an adversarial scheme to a simple Multi-Layer Perceptron, a classifier trained on a subset of just 1% of the original training material achieves 92% accuracy. Moreover, on a complex classification scheme, e.g., involving 50 classes, the proposed method outperforms state-of-the-art alternatives such as BERT.
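To make the adversarial scheme concrete, the following is a minimal sketch (not the authors' implementation) of an SS-GAN over precomputed sentence embeddings, written in PyTorch. The embedding dimension, class count, network sizes, and the train_step helper are illustrative assumptions; in the paper the input embeddings come from a Nystrom approximation of a linguistic RKHS combined with a Universal Sentence Encoder, which random tensors stand in for here.

```python
# Minimal SS-GAN sketch over precomputed sentence embeddings (PyTorch).
# Hypothetical sizes: 512-d embeddings, 50 target classes, 100-d noise.
import torch
import torch.nn as nn
import torch.nn.functional as F

EMB_DIM, NOISE_DIM, NUM_CLASSES = 512, 100, 50  # illustrative, not the paper's

class Generator(nn.Module):
    """Maps noise vectors to fake sentence embeddings."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, EMB_DIM),
        )
    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    """Simple MLP with K real classes plus one extra 'fake' logit (SS-GAN)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(EMB_DIM, 256), nn.LeakyReLU(0.2), nn.Dropout(0.3),
            nn.Linear(256, NUM_CLASSES + 1),  # index NUM_CLASSES = fake
        )
    def forward(self, x):
        return self.net(x)

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
FAKE = NUM_CLASSES  # label index of the extra "generated" class

def train_step(x_lab, y_lab, x_unlab):
    """One update using labeled, unlabeled, and generated embeddings."""
    z = torch.randn(x_unlab.size(0), NOISE_DIM)

    # --- Discriminator: supervised loss + two unsupervised terms ---
    opt_d.zero_grad()
    sup = F.cross_entropy(D(x_lab), y_lab)            # labeled: true class
    logits_u = D(x_unlab)
    # Unlabeled examples should fall into *some* real class, not 'fake'.
    p_real = 1.0 - F.softmax(logits_u, dim=1)[:, FAKE]
    unsup_real = -torch.log(p_real + 1e-8).mean()
    fake_lbl = torch.full((z.size(0),), FAKE, dtype=torch.long)
    unsup_fake = F.cross_entropy(D(G(z).detach()), fake_lbl)
    (sup + unsup_real + unsup_fake).backward()
    opt_d.step()

    # --- Generator: fool D into assigning fakes to a real class ---
    opt_g.zero_grad()
    p_fake = F.softmax(D(G(z)), dim=1)[:, FAKE]
    g_loss = -torch.log(1.0 - p_fake + 1e-8).mean()
    g_loss.backward()
    opt_g.step()
    return sup.item(), g_loss.item()

# Toy usage with random tensors standing in for real sentence embeddings:
x_l = torch.randn(8, EMB_DIM); y_l = torch.randint(0, NUM_CLASSES, (8,))
x_u = torch.randn(32, EMB_DIM)
print(train_step(x_l, y_l, x_u))
```

The key SS-GAN idea visible in the sketch is the (K+1)-way discriminator: labeled examples are pushed toward their true class, unlabeled examples toward "any real class", and generated embeddings toward the extra fake class, so unlabeled data shapes the decision surface of the same MLP that performs classification.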
Pages: 201-214
Page count: 14
Related papers
50 records in total
  • [1] Causal representation for few-shot text classification
    Yang, Maoqin
    Zhang, Xuejie
    Wang, Jin
    Zhou, Xiaobing
    [J]. APPLIED INTELLIGENCE, 2023, 53 (18) : 21422 - 21432
  • [2] Few-shot learning for short text classification
    Yan, Leiming
    Zheng, Yuhui
    Cao, Jie
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (22) : 29799 - 29810
  • [3] Continual Few-Shot Learning for Text Classification
    Pasunuru, Ramakanth
    Stoyanov, Veselin
    Bansal, Mohit
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 5688 - 5702
  • [4] Induction Networks for Few-Shot Text Classification
    Geng, Ruiying
    Li, Binhua
    Li, Yongbin
    Zhu, Xiaodan
    Jian, Ping
    Sun, Jian
    [J]. 2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 3904 - 3913
  • [5] Meta-Learning Adversarial Domain Adaptation Network for Few-Shot Text Classification
    Han, ChengCheng
    Fan, Zeqiu
    Zhang, Dongxiang
    Qiu, Minghui
    Gao, Ming
    Zhou, Aoying
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 1664 - 1673
  • [6] Label Semantic Aware Pre-training for Few-shot Text Classification
    Mueller, Aaron
    Krone, Jason
    Romeo, Salvatore
    Mansour, Saab
    Mansimov, Elman
    Zhang, Yi
    Roth, Dan
    [J]. PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 8318 - 8334
  • [7] Few-Shot and Prompt Training for Text Classification in German Doctor's Letters
    Richter-Pechanski, Phillip
    Wiesenbach, Philipp
Schwab, Dominic M.
    Kiriakou, Christina
    He, Mingyang
    Geis, Nicolas A.
    Frank, Anette
    Dieterich, Christoph
    [J]. CARING IS SHARING-EXPLOITING THE VALUE IN DATA FOR HEALTH AND INNOVATION-PROCEEDINGS OF MIE 2023, 2023, 302 : 819 - 820
  • [8] Uncertainty-aware Self-training for Few-shot Text Classification
    Mukherjee, Subhabrata
    Awadallah, Ahmed Hassan
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33