Improving Neural Topic Models with Wasserstein Knowledge Distillation

Cited by: 1
Authors
Adhya, Suman [1 ]
Sanyal, Debarshi Kumar [1 ]
Affiliations
[1] Indian Assoc Cultivat Sci, Jadavpur 700032, India
Keywords
Topic modeling; Knowledge distillation; Wasserstein distance; Contextualized topic model; Variational autoencoder;
DOI
10.1007/978-3-031-28238-6_21
Chinese Library Classification (CLC)
TP [automation technology, computer technology];
Discipline code
0812;
Abstract
Topic modeling is a dominant method for exploring document collections on the web and in digital libraries. Recent approaches to topic modeling use pretrained contextualized language models and variational autoencoders. However, large neural topic models have a considerable memory footprint. In this paper, we propose a knowledge distillation framework to compress a contextualized topic model without loss in topic quality. In particular, the proposed distillation objective is to minimize the cross-entropy of the soft labels produced by the teacher and the student models, as well as to minimize the squared 2-Wasserstein distance between the latent distributions learned by the two models. Experiments on two publicly available datasets show that the student trained with knowledge distillation achieves topic coherence much higher than that of the original student model, and even surpasses the teacher while containing far fewer parameters. The distilled model also outperforms several other competitive topic models on topic coherence.
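The objective described above combines two terms: the cross-entropy between teacher and student soft labels, and the squared 2-Wasserstein distance between the latent distributions. For VAE-based topic models whose latent posteriors are diagonal Gaussians, the squared 2-Wasserstein distance has a closed form: ||mu1 - mu2||^2 + ||sigma1 - sigma2||^2. The sketch below illustrates these two terms in NumPy; the function names, the weighting factor `lam`, and the implementation details are our own illustration, not code from the paper.

```python
import numpy as np

def w2_squared_diag_gaussians(mu1, sigma1, mu2, sigma2):
    """Squared 2-Wasserstein distance between N(mu1, diag(sigma1^2))
    and N(mu2, diag(sigma2^2)).

    For diagonal Gaussians the closed form reduces to
    ||mu1 - mu2||^2 + ||sigma1 - sigma2||^2,
    where sigma1, sigma2 are vectors of standard deviations.
    """
    mu1, sigma1, mu2, sigma2 = map(np.asarray, (mu1, sigma1, mu2, sigma2))
    return float(np.sum((mu1 - mu2) ** 2) + np.sum((sigma1 - sigma2) ** 2))

def soft_label_cross_entropy(teacher_probs, student_probs, eps=1e-12):
    """Cross-entropy of the student's soft labels under the teacher's,
    for a single example; eps guards against log(0)."""
    t = np.asarray(teacher_probs)
    s = np.asarray(student_probs)
    return float(-np.sum(t * np.log(s + eps)))

def distillation_loss(teacher_probs, student_probs,
                      mu_t, sigma_t, mu_s, sigma_s, lam=1.0):
    """Illustrative combined objective: soft-label cross-entropy plus a
    weighted squared 2-Wasserstein term (lam is a hypothetical weight)."""
    ce = soft_label_cross_entropy(teacher_probs, student_probs)
    w2 = w2_squared_diag_gaussians(mu_t, sigma_t, mu_s, sigma_s)
    return ce + lam * w2
```

As a sanity check, the Wasserstein term vanishes when teacher and student posteriors coincide, so in that case the loss reduces to the soft-label cross-entropy alone.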
Pages: 321-330
Page count: 10
Related papers
50 records in total
  • [41] Improving Knowledge Distillation via Head and Tail Categories
    Xu, Liuchi
    Ren, Jin
    Huang, Zhenhua
    Zheng, Weishi
    Chen, Yunwen
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (05) : 3465 - 3480
  • [42] Improving the accuracy of pruned network using knowledge distillation
    Prakosa, Setya Widyawan
    Leu, Jenq-Shiou
    Chen, Zhao-Hong
    [J]. PATTERN ANALYSIS AND APPLICATIONS, 2021, 24 (02) : 819 - 830
  • [44] Benchmarking Neural Topic Models: An Empirical Study
    Thanh-Nam Doan
    Tuan-Anh Hoang
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 4363 - 4368
  • [45] Knowledge Distillation: Bad Models Can Be Good Role Models
    Kaplun, Gal
    Malach, Eran
    Nakkiran, Preetum
    Shalev-Shwartz, Shai
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [46] Improving the accuracy of mechanistic models for dynamic batch distillation enabled by neural network: An industrial plant case
    Xiaoyu Zhou
    Xiangyi Gao
    Mingmei Wang
    Erwei Song
    Erqiang Wang
    [J]. Chinese Journal of Chemical Engineering, 2024, 73 (09) : 290 - 300
  • [48] Neural Machine Translation with Heterogeneous Topic Knowledge Embeddings
    Wang, Weixuan
    Peng, Wei
    Zhang, Meng
    Liu, Qun
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 3197 - 3202
  • [49] Translating with Bilingual Topic Knowledge for Neural Machine Translation
    Wei, Xiangpeng
    Hu, Yue
    Xing, Luxi
    Wang, Yipeng
    Gao, Li
    [J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 7257 - 7264
  • [50] On using neural networks models for distillation control
    Munsif, HP
    Riggs, JB
    [J]. DISTILLATION AND ABSORPTION '97, VOLS 1 AND 2, 1997, (142): : 259 - 268