Latent-Variable Generative Models for Data-Efficient Text Classification

Cited by: 0
Authors
Ding, Xiaoan [1 ]
Gimpel, Kevin [2 ]
Affiliations
[1] Univ Chicago, Chicago, IL 60637 USA
[2] Toyota Technol Inst Chicago, Chicago, IL 60637 USA
Keywords
DOI
Not available
CLC classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Generative classifiers offer potential advantages over their discriminative counterparts, namely in the areas of data efficiency, robustness to data shift and adversarial examples, and zero-shot learning (Ng and Jordan, 2002; Yogatama et al., 2017; Lewis and Fan, 2019). In this paper, we improve generative text classifiers by introducing discrete latent variables into the generative story, and explore several graphical model configurations. We parameterize the distributions using standard neural architectures used in conditional language modeling and perform learning by directly maximizing the log marginal likelihood via gradient-based optimization, which avoids the need to do expectation-maximization. We empirically characterize the performance of our models on six text classification datasets. The choice of where to include the latent variable has a significant impact on performance, with the strongest results obtained when using the latent variable as an auxiliary conditioning variable in the generation of the textual input. This model consistently outperforms both the generative and discriminative classifiers in small-data settings. We analyze our model by using it for controlled generation, finding that the latent variable captures interpretable properties of the data, even with very small training sets.
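The abstract describes classifying with a generative model that includes a discrete latent variable, trained by directly maximizing the log marginal likelihood rather than running expectation-maximization. A minimal sketch of the inference side, assuming a toy unigram parameterization of p(x | y, z) in place of the paper's neural conditional language models (all names and the tiny vocabulary here are illustrative):

```python
import numpy as np

# Hedged sketch: a generative classifier with factorization
#   p(y, z, x) = p(y) p(z) p(x | y, z)
# where the discrete latent z is marginalized exactly via logsumexp,
# so no EM is required. p(x | y, z) is a toy unigram model here;
# the paper parameterizes it with neural conditional language models.

rng = np.random.default_rng(0)
n_classes, n_latent, vocab = 2, 3, 5

log_prior_y = np.log(np.full(n_classes, 1.0 / n_classes))
log_prior_z = np.log(np.full(n_latent, 1.0 / n_latent))
# Per-(y, z) token log-probabilities (rows normalized over the vocabulary).
logits = rng.normal(size=(n_classes, n_latent, vocab))
log_p_tok = logits - np.log(np.exp(logits).sum(-1, keepdims=True))

def log_marginal(x_tokens, y):
    """log p(x | y) = logsumexp_z [ log p(z) + sum_t log p(x_t | y, z) ]."""
    per_z = log_prior_z + log_p_tok[y][:, x_tokens].sum(axis=1)
    m = per_z.max()  # stabilized logsumexp
    return m + np.log(np.exp(per_z - m).sum())

def classify(x_tokens):
    """Bayes-rule prediction: argmax_y log p(y) + log p(x | y)."""
    scores = [log_prior_y[y] + log_marginal(x_tokens, y)
              for y in range(n_classes)]
    return int(np.argmax(scores))

x = np.array([0, 3, 3, 1])  # a toy token sequence
print(classify(x))
```

Training would maximize `log_marginal(x, y_gold) + log_prior_y[y_gold]` over the parameters with gradient-based optimization; because the marginalization over z is exact and differentiable, the log marginal likelihood can be optimized directly, which is the property the abstract highlights.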
Pages: 507-517
Page count: 11