Improving Neural Topic Models with Wasserstein Knowledge Distillation

Cited by: 1
Authors
Adhya, Suman [1 ]
Sanyal, Debarshi Kumar [1 ]
Institution
[1] Indian Association for the Cultivation of Science, Jadavpur 700032, India
Keywords
Topic modeling; Knowledge distillation; Wasserstein distance; Contextualized topic model; Variational autoencoder;
DOI
10.1007/978-3-031-28238-6_21
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Topic modeling is a dominant method for exploring document collections on the web and in digital libraries. Recent approaches to topic modeling use pretrained contextualized language models and variational autoencoders. However, large neural topic models have a considerable memory footprint. In this paper, we propose a knowledge distillation framework to compress a contextualized topic model without loss in topic quality. In particular, the proposed distillation objective is to minimize the cross-entropy of the soft labels produced by the teacher and the student models, as well as to minimize the squared 2-Wasserstein distance between the latent distributions learned by the two models. Experiments on two publicly available datasets show that the student trained with knowledge distillation achieves topic coherence much higher than that of the original student model, and even surpasses the teacher while containing far fewer parameters than the teacher. The distilled model also outperforms several other competitive topic models on topic coherence.
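The distillation objective described in the abstract combines two terms: a cross-entropy between the teacher's and the student's soft labels, and the squared 2-Wasserstein distance between the latent distributions learned by the two variational autoencoders. The sketch below is only an illustration of those two terms, assuming diagonal Gaussian latent distributions (as in ProdLDA-style VAEs), for which the 2-Wasserstein distance has a simple closed form; the function names and the weights alpha and beta are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def w2_squared_diag_gaussians(mu_t, sigma_t, mu_s, sigma_s):
    """Closed-form squared 2-Wasserstein distance between two diagonal
    Gaussians N(mu_t, diag(sigma_t**2)) and N(mu_s, diag(sigma_s**2))."""
    return float(np.sum((mu_t - mu_s) ** 2) + np.sum((sigma_t - sigma_s) ** 2))

def soft_label_cross_entropy(teacher_probs, student_logits):
    """Cross-entropy between the teacher's soft labels and the student's
    predicted distribution, using a numerically stable log-softmax."""
    z = student_logits - np.max(student_logits)
    log_q = z - np.log(np.sum(np.exp(z)))
    return float(-np.sum(teacher_probs * log_q))

def distillation_loss(mu_t, sigma_t, mu_s, sigma_s,
                      teacher_probs, student_logits,
                      alpha=1.0, beta=1.0):
    """Combined objective: soft-label cross-entropy plus the squared
    2-Wasserstein term; alpha and beta are hypothetical weights."""
    return (alpha * soft_label_cross_entropy(teacher_probs, student_logits)
            + beta * w2_squared_diag_gaussians(mu_t, sigma_t, mu_s, sigma_s))
```

The closed form used here, ||mu_t - mu_s||^2 + ||sigma_t - sigma_s||^2, holds only when both covariances are diagonal; the exact weighting of the two terms is defined in the paper.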
Pages: 321-330
Number of pages: 10
Related Papers (10 of 50 shown)
  • [1] Hoyle, Alexander; Goel, Pranav; Resnik, Philip. Improving Neural Topic Models using Knowledge Distillation. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020: 1752-1771.
  • [2] Lu, Hongyun; Zhang, Mengmeng; Jing, Hongyuan; Liu, Zhi. Improving Sliced Wasserstein Distance with Geometric Median for Knowledge Distillation. IEICE Transactions on Information and Systems, 2024, E107D(7): 890-893.
  • [3] Liu, Xuan; Wang, Xiaoguang; Matwin, Stan. Improving the Interpretability of Deep Neural Networks with Knowledge Distillation. 2018 18th IEEE International Conference on Data Mining Workshops (ICDMW), 2018: 905-912.
  • [4] Zhang, Songming; Liang, Yunlong; Wang, Shuaibo; Chen, Yufeng; Han, Wenjuan; Liu, Jian; Xu, Jinan. Towards Understanding and Improving Knowledge Distillation for Neural Machine Translation. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023), Vol. 1, 2023: 8062-8079.
  • [5] Chu, Haoyu; Wei, Shikui; Lu, Qiming; Zhao, Yao. Improving neural ordinary differential equations via knowledge distillation. IET Computer Vision, 2024, 18(2): 304-314.
  • [6] Gourtani, Saeed Khalilian; Meratnia, Nirvana. Improving Robustness of Compressed Models with Weight Sharing through Knowledge Distillation. 2024 IEEE 10th International Conference on Edge Computing and Scalable Cloud (EDGECOM 2024), 2024: 13-21.
  • [7] Li, Raymond; Gonzalez-Pizarro, Felipe; Xing, Linzi; Murray, Gabriel; Carenini, Giuseppe. Diversity-Aware Coherence Loss for Improving Neural Topic Models. 61st Conference of the Association for Computational Linguistics (ACL 2023), Vol. 2, 2023: 1710-1722.
  • [8] Xie, Pengtao; Du, Xuefeng. Performance-Aware Mutual Knowledge Distillation for Improving Neural Architecture Search. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022: 11912-11922.
  • [9] Liu, Qun; Mukhopadhyay, Supratik; Zhu, Yimin; Gudishala, Ravindra; Saeidi, Sanaz; Nabijiang, Alimire. Improving Route Choice Models by Incorporating Contextual Factors via Knowledge Distillation. 2019 International Joint Conference on Neural Networks (IJCNN), 2019.
  • [10] Tan, Chao; Liu, Jie. Improving Knowledge Distillation With a Customized Teacher. IEEE Transactions on Neural Networks and Learning Systems, 2024, 35(2): 2290-2299.