Adversarial training for few-shot text classification

Cited by: 2
Authors
Croce, Danilo [1 ]
Castellucci, Giuseppe [1 ]
Basili, Roberto [1 ]
Affiliations
[1] Univ Roma Tor Vergata, Dept Enterprise Engn, Rome, RM, Italy
Keywords
Semi-supervised learning; generative adversarial network; kernel-based embedding spaces; universal sentence encoding; Nystrom method
DOI
10.3233/IA-200051
CLC classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
In recent years, Deep Learning methods have become very popular in classification tasks for Natural Language Processing (NLP), mainly because they reach high performance while relying on very simple input representations, i.e., raw tokens. One drawback of deep architectures is the large amount of annotated data required for effective training. In Machine Learning this problem is usually mitigated by semi-supervised methods or, more recently, by Transfer Learning in the context of deep architectures. One recent promising method to enable semi-supervised learning in deep architectures has been formalized within Semi-Supervised Generative Adversarial Networks (SS-GANs) in the context of Computer Vision. In this paper, we adopt the SS-GAN framework to enable semi-supervised learning for NLP. We demonstrate how an SS-GAN can boost the performance of simple architectures when operating over expressive low-dimensional embeddings, derived by combining the unsupervised approximation of linguistic Reproducing Kernel Hilbert Spaces with so-called Universal Sentence Encoders. We experimentally evaluate the proposed approach on a semantic classification task, i.e., Question Classification, considering different sizes of training material and different numbers of target classes. By applying such an adversarial scheme to a simple Multi-Layer Perceptron, a classifier trained on a subset of just 1% of the original training material achieves 92% accuracy. Moreover, on a complex classification scheme, e.g., involving 50 classes, the proposed method outperforms state-of-the-art alternatives such as BERT.
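To make the adversarial scheme concrete, the following is a minimal sketch (not the authors' implementation) of an SS-GAN over precomputed sentence embeddings, written in PyTorch. The embedding dimension, class count, network sizes, and the train_step helper are illustrative assumptions; in the paper the input embeddings come from a Nystrom approximation of a linguistic RKHS combined with a Universal Sentence Encoder, which random tensors stand in for here.

```python
# Minimal SS-GAN sketch over precomputed sentence embeddings (PyTorch).
# Hypothetical sizes: 512-d embeddings, 50 target classes, 100-d noise.
import torch
import torch.nn as nn
import torch.nn.functional as F

EMB_DIM, NOISE_DIM, NUM_CLASSES = 512, 100, 50  # illustrative, not the paper's

class Generator(nn.Module):
    """Maps noise vectors to fake sentence embeddings."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, EMB_DIM),
        )
    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    """Simple MLP with K real classes plus one extra 'fake' logit (SS-GAN)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(EMB_DIM, 256), nn.LeakyReLU(0.2), nn.Dropout(0.3),
            nn.Linear(256, NUM_CLASSES + 1),  # index NUM_CLASSES = fake
        )
    def forward(self, x):
        return self.net(x)

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
FAKE = NUM_CLASSES  # label index of the extra "generated" class

def train_step(x_lab, y_lab, x_unlab):
    """One update using labeled, unlabeled, and generated embeddings."""
    z = torch.randn(x_unlab.size(0), NOISE_DIM)

    # --- Discriminator: supervised loss + two unsupervised terms ---
    opt_d.zero_grad()
    sup = F.cross_entropy(D(x_lab), y_lab)            # labeled: true class
    logits_u = D(x_unlab)
    # Unlabeled examples should fall into *some* real class, not 'fake'.
    p_real = 1.0 - F.softmax(logits_u, dim=1)[:, FAKE]
    unsup_real = -torch.log(p_real + 1e-8).mean()
    fake_lbl = torch.full((z.size(0),), FAKE, dtype=torch.long)
    unsup_fake = F.cross_entropy(D(G(z).detach()), fake_lbl)
    (sup + unsup_real + unsup_fake).backward()
    opt_d.step()

    # --- Generator: fool D into assigning fakes to a real class ---
    opt_g.zero_grad()
    p_fake = F.softmax(D(G(z)), dim=1)[:, FAKE]
    g_loss = -torch.log(1.0 - p_fake + 1e-8).mean()
    g_loss.backward()
    opt_g.step()
    return sup.item(), g_loss.item()

# Toy usage with random tensors standing in for real sentence embeddings:
x_l = torch.randn(8, EMB_DIM); y_l = torch.randint(0, NUM_CLASSES, (8,))
x_u = torch.randn(32, EMB_DIM)
print(train_step(x_l, y_l, x_u))
```

The key SS-GAN idea visible in the sketch is the (K+1)-way discriminator: labeled examples are pushed toward their true class, unlabeled examples toward "any real class", and generated embeddings toward the extra fake class, so unlabeled data shapes the decision surface of the same MLP that performs classification.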
Pages: 201-214
Page count: 14
Related papers
50 records in total
  • [1] Causal representation for few-shot text classification
    Yang, Maoqin
    Zhang, Xuejie
    Wang, Jin
    Zhou, Xiaobing
    [J]. APPLIED INTELLIGENCE, 2023, 53 (18) : 21422 - 21432
  • [2] Few-shot learning for short text classification
    Yan, Leiming
    Zheng, Yuhui
    Cao, Jie
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (22) : 29799 - 29810
  • [3] Continual Few-Shot Learning for Text Classification
    Pasunuru, Ramakanth
    Stoyanov, Veselin
    Bansal, Mohit
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 5688 - 5702
  • [4] Induction Networks for Few-Shot Text Classification
    Geng, Ruiying
    Li, Binhua
    Li, Yongbin
    Zhu, Xiaodan
    Jian, Ping
    Sun, Jian
    [J]. 2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 3904 - 3913
  • [5] Meta-Learning Adversarial Domain Adaptation Network for Few-Shot Text Classification
    Han, ChengCheng
    Fan, Zeqiu
    Zhang, Dongxiang
    Qiu, Minghui
    Gao, Ming
    Zhou, Aoying
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 1664 - 1673
  • [6] Label Semantic Aware Pre-training for Few-shot Text Classification
    Mueller, Aaron
    Krone, Jason
    Romeo, Salvatore
    Mansour, Saab
    Mansimov, Elman
    Zhang, Yi
    Roth, Dan
    [J]. PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 8318 - 8334
  • [7] Few-Shot and Prompt Training for Text Classification in German Doctor's Letters
    Richter-Pechanski, Phillip
    Wiesenbach, Philipp
Schwab, Dominic M.
    Kiriakou, Christina
    He, Mingyang
    Geis, Nicolas A.
    Frank, Anette
    Dieterich, Christoph
    [J]. CARING IS SHARING-EXPLOITING THE VALUE IN DATA FOR HEALTH AND INNOVATION-PROCEEDINGS OF MIE 2023, 2023, 302 : 819 - 820
  • [8] Uncertainty-aware Self-training for Few-shot Text Classification
    Mukherjee, Subhabrata
    Awadallah, Ahmed Hassan
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33