A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings

Cited by: 3
Authors
Zhu, Lixing [1 ]
He, Yulan [1 ]
Zhou, Deyu [2 ]
Affiliations
[1] Univ Warwick, Dept Comp Sci, Warwick, England
[2] Southeast Univ, Sch Comp Sci & Engn, Key Lab Comp Network & Informat Integrat, Minist Educ, Nanjing, Peoples R China
Funding
UK Engineering and Physical Sciences Research Council (EPSRC); National Natural Science Foundation of China (NSFC)
Keywords
Natural language processing systems - Semantics;
DOI
10.1162/tacl_a_00326
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a novel generative model to explore both local and global context for joint learning topics and topic-specific word embeddings. In particular, we assume that global latent topics are shared across documents, a word is generated by a hidden semantic vector encoding its contextual semantic meaning, and its context words are generated conditional on both the hidden semantic vector and global latent topics. Topics are trained jointly with the word embeddings. The trained model maps words to topic-dependent embeddings, which naturally addresses the issue of word polysemy. Experimental results show that the proposed model outperforms the word-level embedding methods in both word similarity evaluation and word sense disambiguation. Furthermore, the model also extracts more coherent topics compared with existing neural topic models or other models for joint learning of topics and word embeddings. Finally, the model can be easily integrated with existing deep contextualized word embedding learning methods to further improve the performance of downstream tasks such as sentiment classification.
Pages: 471 - 485
Number of pages: 15