Latent-Variable Generative Models for Data-Efficient Text Classification

Cited by: 0

Authors:

Ding, Xiaoan [1]

Gimpel, Kevin [2]

Affiliations:

[1] Univ Chicago, Chicago, IL 60637 USA

[2] Toyota Technol Inst Chicago, Chicago, IL 60637 USA

Source:

2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE | 2019

Keywords:

DOI: not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Generative classifiers offer potential advantages over their discriminative counterparts, namely in the areas of data efficiency, robustness to data shift and adversarial examples, and zero-shot learning (Ng and Jordan, 2002; Yogatama et al., 2017; Lewis and Fan, 2019). In this paper, we improve generative text classifiers by introducing discrete latent variables into the generative story, and explore several graphical model configurations. We parameterize the distributions using standard neural architectures used in conditional language modeling and perform learning by directly maximizing the log marginal likelihood via gradient-based optimization, which avoids the need to do expectation-maximization. We empirically characterize the performance of our models on six text classification datasets. The choice of where to include the latent variable has a significant impact on performance, with the strongest results obtained when using the latent variable as an auxiliary conditioning variable in the generation of the textual input. This model consistently outperforms both the generative and discriminative classifiers in small-data settings. We analyze our model by using it for controlled generation, finding that the latent variable captures interpretable properties of the data, even with very small training sets.
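The marginalization described in the abstract can be illustrated with a minimal sketch. It assumes one possible factorization, p(x, y, c) = p(y) p(c) p(x | y, c), in which a discrete latent variable c acts as an auxiliary conditioning variable for the text x; the abstract does not give the exact factorization, so this form, the function names, and the toy probabilities are all hypothetical. A label is scored by summing out c in log space, and training would maximize the sum of these log marginals over labeled examples with a gradient-based optimizer.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_joint(log_p_y, log_p_c, log_p_x_given_yc):
    """Return log p(x, y) for every label y, marginalizing the latent c.

    Assumed factorization (not taken from the paper):
        p(x, y) = sum_c p(y) * p(c) * p(x | y, c)
    log_p_x_given_yc[y][c] stands in for the score a conditional
    language model would assign to the text x given (y, c).
    """
    num_latent = len(log_p_c)
    return [
        logsumexp([
            log_p_y[y] + log_p_c[c] + log_p_x_given_yc[y][c]
            for c in range(num_latent)
        ])
        for y in range(len(log_p_y))
    ]


def classify(log_p_y, log_p_c, log_p_x_given_yc):
    """Predict the label with the highest log marginal likelihood."""
    scores = log_joint(log_p_y, log_p_c, log_p_x_given_yc)
    return max(range(len(scores)), key=scores.__getitem__)


# Toy example: 2 labels, 2 latent values, made-up probabilities.
log_p_y = [math.log(0.5), math.log(0.5)]
log_p_c = [math.log(0.5), math.log(0.5)]
log_p_x_given_yc = [
    [math.log(0.2), math.log(0.4)],  # label 0
    [math.log(0.1), math.log(0.1)],  # label 1
]

scores = log_joint(log_p_y, log_p_c, log_p_x_given_yc)
predicted = classify(log_p_y, log_p_c, log_p_x_given_yc)
```

Because c takes only a small number of discrete values, the sum over c is computed exactly, which is why the objective can be optimized directly without an expectation-maximization loop.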

Pages: 507-517

Number of pages: 11
