GraphBTM: Graph Enhanced Autoencoded Variational Inference for Biterm Topic Model

Authors
Zhu, Qile [1]
Feng, Zheng [1]
Li, Xiaolin [1]
Affiliations
[1] Univ Florida, NSF Ctr Big Learning, Large Scale Intelligent Syst Lab, Gainesville, FL 32611 USA
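Funding
US National Institutes of Health; US National Science Foundation
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract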
Discovering the latent topics within texts has been a fundamental task for many applications. However, conventional topic models suffer from different problems in different settings. Latent Dirichlet Allocation (LDA) may not work well for short texts due to data sparsity (i.e., the sparse word co-occurrence patterns in short documents). The Biterm Topic Model (BTM) learns topics by modeling word pairs, called biterms, across the whole corpus. This assumption is very strong when documents are long with rich topic information and do not exhibit the transitivity of biterms. In this paper, we propose a novel approach called GraphBTM that represents biterms as graphs and uses Graph Convolutional Networks (GCNs) with residual connections to extract transitive features from biterms. To overcome the data sparsity of LDA and the strong assumption of BTM, we sample a fixed number of documents to form a mini-corpus as a training instance. We also propose a dataset called All News, extracted from (Thompson, 2017), in which documents are much longer than those in 20 Newsgroups. We present an amortized variational inference method for GraphBTM. Our method generates more coherent topics than previous approaches. Experiments show that the sampling strategy improves performance by a large margin.
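The idea of representing biterms as a graph over a sampled mini-corpus can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`extract_biterms`, `biterm_graph`, `sample_mini_corpus`), the optional `window` parameter, and the toy corpus are all assumptions made for illustration. The abstract does not specify how biterms are extracted or how edges are weighted, so the sketch simply counts unordered word pairs per document.

```python
import random
from collections import Counter
from itertools import combinations


def extract_biterms(tokens, window=None):
    """Return the unordered word pairs (biterms) co-occurring in one document.

    `window` is a hypothetical parameter: if set, only pairs whose positions
    are fewer than `window` tokens apart are kept.
    """
    pairs = []
    for i, j in combinations(range(len(tokens)), 2):
        if window is not None and j - i >= window:
            continue
        if tokens[i] != tokens[j]:  # skip pairs of the same word
            pairs.append(tuple(sorted((tokens[i], tokens[j]))))
    return pairs


def biterm_graph(mini_corpus, window=None):
    """Aggregate biterm counts over a mini-corpus into a weighted edge map.

    Keys are (word_a, word_b) edges; values are co-occurrence counts.
    """
    edges = Counter()
    for tokens in mini_corpus:
        edges.update(extract_biterms(tokens, window))
    return edges


def sample_mini_corpus(corpus, size, rng):
    """Draw a fixed number of documents to act as one training instance."""
    return rng.sample(corpus, size)


# Toy corpus (illustrative data only).
corpus = [
    ["graph", "topic", "model"],
    ["topic", "model", "inference"],
    ["graph", "network"],
]
rng = random.Random(0)
mini = sample_mini_corpus(corpus, 2, rng)
graph = biterm_graph(mini)   # graph for one sampled mini-corpus
full = biterm_graph(corpus)  # graph over the whole toy corpus
```

In a setup like this, each sampled mini-corpus yields one weighted word graph, which could then be fed to a graph encoder such as a GCN. The encoder and the variational inference step are outside the scope of this sketch.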
Pages: 4663-4672
Page count: 10