Improving Neural Topic Models with Wasserstein Knowledge Distillation

Cited by: 1
Authors
Adhya, Suman [1 ]
Sanyal, Debarshi Kumar [1 ]
Institution
[1] Indian Association for the Cultivation of Science, Jadavpur 700032, India
Keywords
Topic modeling; Knowledge distillation; Wasserstein distance; Contextualized topic model; Variational autoencoder;
DOI
10.1007/978-3-031-28238-6_21
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Topic modeling is a dominant method for exploring document collections on the web and in digital libraries. Recent approaches to topic modeling use pretrained contextualized language models and variational autoencoders. However, large neural topic models have a considerable memory footprint. In this paper, we propose a knowledge distillation framework to compress a contextualized topic model without loss in topic quality. In particular, the proposed distillation objective is to minimize the cross-entropy of the soft labels produced by the teacher and the student models, as well as to minimize the squared 2-Wasserstein distance between the latent distributions learned by the two models. Experiments on two publicly available datasets show that the student trained with knowledge distillation achieves topic coherence much higher than that of the original student model, and even surpasses the teacher while containing far fewer parameters than the teacher. The distilled model also outperforms several other competitive topic models on topic coherence.
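The distillation objective described in the abstract combines two terms: a cross-entropy between the teacher's and the student's soft labels, and the squared 2-Wasserstein distance between the latent distributions learned by the two variational autoencoders. The sketch below is only an illustration of those two terms, assuming diagonal Gaussian latent distributions (as in ProdLDA-style VAEs), for which the 2-Wasserstein distance has a simple closed form; the function names and the weights alpha and beta are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def w2_squared_diag_gaussians(mu_t, sigma_t, mu_s, sigma_s):
    """Closed-form squared 2-Wasserstein distance between two diagonal
    Gaussians N(mu_t, diag(sigma_t**2)) and N(mu_s, diag(sigma_s**2))."""
    return float(np.sum((mu_t - mu_s) ** 2) + np.sum((sigma_t - sigma_s) ** 2))

def soft_label_cross_entropy(teacher_probs, student_logits):
    """Cross-entropy between the teacher's soft labels and the student's
    predicted distribution, using a numerically stable log-softmax."""
    z = student_logits - np.max(student_logits)
    log_q = z - np.log(np.sum(np.exp(z)))
    return float(-np.sum(teacher_probs * log_q))

def distillation_loss(mu_t, sigma_t, mu_s, sigma_s,
                      teacher_probs, student_logits,
                      alpha=1.0, beta=1.0):
    """Combined objective: soft-label cross-entropy plus the squared
    2-Wasserstein term; alpha and beta are hypothetical weights."""
    return (alpha * soft_label_cross_entropy(teacher_probs, student_logits)
            + beta * w2_squared_diag_gaussians(mu_t, sigma_t, mu_s, sigma_s))
```

The closed form used here, ||mu_t - mu_s||^2 + ||sigma_t - sigma_s||^2, holds only when both covariances are diagonal; the exact weighting of the two terms is defined in the paper.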
Pages: 321-330
Number of pages: 10
Related Papers (10 of 50 shown)
  • [1] Hoyle, Alexander; Goel, Pranav; Resnik, Philip. Improving Neural Topic Models using Knowledge Distillation. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020: 1752-1771.
  • [2] Lu, Hongyun; Zhang, Mengmeng; Jing, Hongyuan; Liu, Zhi. Improving Sliced Wasserstein Distance with Geometric Median for Knowledge Distillation. IEICE Transactions on Information and Systems, 2024, E107D(7): 890-893.
  • [3] Liu, Xuan; Wang, Xiaoguang; Matwin, Stan. Improving the Interpretability of Deep Neural Networks with Knowledge Distillation. 2018 18th IEEE International Conference on Data Mining Workshops (ICDMW), 2018: 905-912.
  • [4] Zhang, Songming; Liang, Yunlong; Wang, Shuaibo; Chen, Yufeng; Han, Wenjuan; Liu, Jian; Xu, Jinan. Towards Understanding and Improving Knowledge Distillation for Neural Machine Translation. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023), Vol. 1, 2023: 8062-8079.
  • [5] Chu, Haoyu; Wei, Shikui; Lu, Qiming; Zhao, Yao. Improving neural ordinary differential equations via knowledge distillation. IET Computer Vision, 2024, 18(2): 304-314.
  • [6] Gourtani, Saeed Khalilian; Meratnia, Nirvana. Improving Robustness of Compressed Models with Weight Sharing through Knowledge Distillation. 2024 IEEE 10th International Conference on Edge Computing and Scalable Cloud (EDGECOM 2024), 2024: 13-21.
  • [7] Li, Raymond; Gonzalez-Pizarro, Felipe; Xing, Linzi; Murray, Gabriel; Carenini, Giuseppe. Diversity-Aware Coherence Loss for Improving Neural Topic Models. 61st Conference of the Association for Computational Linguistics (ACL 2023), Vol. 2, 2023: 1710-1722.
  • [8] Xie, Pengtao; Du, Xuefeng. Performance-Aware Mutual Knowledge Distillation for Improving Neural Architecture Search. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022: 11912-11922.
  • [9] Liu, Qun; Mukhopadhyay, Supratik; Zhu, Yimin; Gudishala, Ravindra; Saeidi, Sanaz; Nabijiang, Alimire. Improving Route Choice Models by Incorporating Contextual Factors via Knowledge Distillation. 2019 International Joint Conference on Neural Networks (IJCNN), 2019.
  • [10] Tan, Chao; Liu, Jie. Improving Knowledge Distillation With a Customized Teacher. IEEE Transactions on Neural Networks and Learning Systems, 2024, 35(2): 2290-2299.