Decoupled Word Embeddings using Latent Topics

Cited: 1
Authors
Park, Heesoo [1 ]
Lee, Jongwuk [1 ]
Affiliations
[1] Sungkyunkwan Univ, Seoul, South Korea
Funding
National Research Foundation of Singapore
Keywords
Multi-sense word embedding; contextualized word embedding; topic modeling
DOI
10.1145/3341105.3373997
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose decoupled word embeddings (DWE), a universal word representation that covers multiple senses of a word. Toward this goal, our model represents each word as a combination of multiple word vectors associated with latent topics. Specifically, we decompose a word vector into multiple sense-specific word vectors, weighted by the topic weights obtained from a pre-trained topic model. Although this dynamic word representation is simple, the proposed model can leverage both local and global contexts. Through extensive experiments, including qualitative and quantitative analyses, we demonstrate that the proposed model is comparable to or better than state-of-the-art word embedding models. The code is publicly available at https://github.com/righ120/DWE.
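
To make the decomposition concrete, here is a minimal sketch of the idea stated in the abstract: a word's embedding is a topic-weighted combination of K topic-specific vectors, with the weights supplied by a pre-trained topic model such as LDA. Every name, shape, and value below (topic_vectors, decoupled_embedding, K = 4, the toy vocabulary) is an illustrative assumption, not the authors' implementation; the actual code is in the repository linked above.

import numpy as np

rng = np.random.default_rng(0)

vocab = {"bank": 0, "river": 1, "money": 2}
K, dim = 4, 8                                  # latent topics, embedding size

# One vector per (topic, word) pair. In the paper these are learned jointly;
# here they are random toy data for illustration only.
topic_vectors = rng.normal(size=(K, len(vocab), dim))

def decoupled_embedding(word_idx, topic_weights):
    """Combine a word's K topic-specific vectors using the topic weights
    of the current context (e.g., inferred by a pre-trained LDA model)."""
    w = np.asarray(topic_weights, dtype=float)
    w = w / w.sum()                            # normalize to a distribution
    return np.tensordot(w, topic_vectors[:, word_idx, :], axes=1)

# The same word gets different vectors under different topic mixtures.
finance_bank = decoupled_embedding(vocab["bank"], [0.7, 0.1, 0.1, 0.1])
river_bank   = decoupled_embedding(vocab["bank"], [0.1, 0.7, 0.1, 0.1])
print(finance_bank @ river_bank)               # the two senses need not align

Because the topic weights vary with the surrounding document, the same word receives a different vector in a finance context than in a river context, which is what lets a single representation cover multiple word senses.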
Pages: 875-882 (8 pages)
Related Papers
50 items in total
  • [21] Text Classification Using Word Embeddings
    Helaskar, Mukund N.
    Sonawane, Sheetal S.
    [J]. 2019 5TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION, CONTROL AND AUTOMATION (ICCUBEA), 2019,
  • [22] Learning Latent Topics from the Word Co-occurrence Network
    Wang, Wu
    Zhou, Houquan
    He, Kun
    Hopcroft, John E.
    [J]. THEORETICAL COMPUTER SCIENCE, NCTCS 2017, 2017, 768 : 18 - 30
  • [23] Stability of Word Embeddings Using Word2Vec
    Chugh, Mansi
    Whigham, Peter A.
    Dick, Grant
    [J]. AI 2018: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, 11320 : 812 - 818
  • [24] Identifying topics by using word distribution
    Nakayama, Motoi
    Miura, Takao
    [J]. 2007 IEEE PACIFIC RIM CONFERENCE ON COMMUNICATIONS, COMPUTERS AND SIGNAL PROCESSING, VOLS 1 AND 2, 2007, : 241 - 244
  • [25] Domain Adaptation for Word Sense Disambiguation Using Word Embeddings
    Komiya, Kanako
    Suzuki, Shota
    Sasaki, Minoru
    Shinnou, Hiroyuki
    Okumura, Manabu
    [J]. COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING (CICLING 2017), PT I, 2018, 10761 : 195 - 206
  • [26] A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings
    Zhu, Lixing
    He, Yulan
    Zhou, Deyu
    [J]. TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2020, 8 : 471 - 485
  • [27] Compressing and interpreting word embeddings with latent space regularization and interactive semantics probing
    Li, Haoyu
    Wang, Junpeng
    Zheng, Yan
    Wang, Liang
    Zhang, Wei
    Shen, Han-Wei
    [J]. INFORMATION VISUALIZATION, 2023, 22 (01) : 52 - 68
  • [28] Unsupervised word embeddings capture latent knowledge from materials science literature
    Tshitoyan, Vahe
    Dagdelen, John
    Weston, Leigh
    Dunn, Alexander
    Rong, Ziqin
    Kononova, Olga
    Persson, Kristin A.
    Ceder, Gerbrand
    Jain, Anubhav
    [J]. NATURE, 2019, 571 (7763) : 95 - 98
  • [30] Automatic keyphrase extraction using word embeddings
    Zhang, Yuxiang
    Liu, Huan
    Wang, Suge
    Ip, W. H.
    Fan, Wei
    Xiao, Chunjing
    [J]. SOFT COMPUTING, 2020, 24 : 5593 - 5608