Nonparametric Topic Modeling with Neural Inference

Cited by: 8

Authors:

Ning, Xuefei [1]

Zheng, Yin [2]

Jiang, Zhuxi [3]

Wang, Yu [1]

Yang, Huazhong [1]

Huang, Junzhou [4]

Zhao, Peilin [4]

Affiliations:

[1] Tsinghua Univ, Beijing, Peoples R China

[2] Tencent, Weixin Grp, Shenzhen, Peoples R China

[3] Momenta, Beijing, Peoples R China

[4] Tencent AI Lab, Shenzhen, Peoples R China

Source:

NEUROCOMPUTING | 2020 / Volume 399 / Issue 399

Funding:

National Natural Science Foundation of China;

Keywords:

DOI:

10.1016/j.neucom.2019.12.128

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This work focuses on combining nonparametric topic models with Auto-Encoding Variational Bayes (AEVB). Specifically, we first propose iTM-VAE, where the topics are treated as trainable parameters and the document-specific topic proportions are obtained by a stick-breaking construction. The inference of iTM-VAE is modeled by neural networks such that it can be computed in a simple feed-forward manner. We also describe how to introduce a hyper-prior into iTM-VAE so as to model the uncertainty of the prior parameter. The hyper-prior technique is general: we show that it can also be applied to other AEVB-based models to alleviate the collapse-to-prior problem. We also propose HiTM-VAE, where the document-specific topic distributions are generated in a hierarchical manner. HiTM-VAE is even more flexible and can generate topic representations with better variability and sparsity. Experimental results on the 20News and Reuters RCV1-V2 datasets show that the proposed models significantly outperform the state-of-the-art baselines. The advantages of the hyper-prior technique and the hierarchical model construction are also confirmed by experiments. (c) 2020 Elsevier B.V. All rights reserved.
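The stick-breaking construction mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the standard truncated construction in which break fractions are drawn from Beta(1, alpha), whereas the abstract states that iTM-VAE infers its quantities with neural networks. The function names, the truncation level, and the value of `alpha` are hypothetical choices made for illustration only.

```python
import random


def stick_breaking(fractions):
    """Turn break fractions v_k in [0, 1] into topic proportions.

    pi_k = v_k * prod_{j<k} (1 - v_j); the final entry holds the leftover
    stick so that the truncated proportions sum to 1.
    """
    remaining = 1.0  # length of the stick not yet assigned to any topic
    proportions = []
    for v in fractions:
        proportions.append(v * remaining)
        remaining *= 1.0 - v
    proportions.append(remaining)  # leftover mass at the truncation level
    return proportions


def sample_proportions(alpha, truncation, rng):
    """Assumed prior: draw v_k ~ Beta(1, alpha), then break the stick."""
    fractions = [rng.betavariate(1.0, alpha) for _ in range(truncation - 1)]
    return stick_breaking(fractions)


# Hypothetical settings: alpha=5.0 and a truncation level of 10 topics.
rng = random.Random(0)
pi = sample_proportions(alpha=5.0, truncation=10, rng=rng)
```

Under this assumed prior, a smaller `alpha` concentrates the mass on the first few topics, while a larger `alpha` spreads it over more of them.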


Pages: 296-306

Page count: 11
