Topical Word Embeddings

Cited by: 0
Authors
Liu, Yang [1 ]
Liu, Zhiyuan [1 ]
Chua, Tat-Seng [2 ]
Sun, Maosong [1 ,3 ]
Affiliations
[1] Tsinghua Univ, Natl Lab Informat Sci & Technol, State Key Lab Intelligent Technol & Syst, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
[2] Natl Univ Singapore, Sch Comp, Singapore, Singapore
[3] Jiangsu Collaborat Innovat Ctr Language Competenc, Nanjing 221009, Jiangsu, Peoples R China
Funding
National Research Foundation of Singapore; National Natural Science Foundation of China;
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most word embedding models represent each word with a single vector, which leaves them unable to discriminate among the senses of ubiquitous homonymous and polysemous words. To enhance this discriminative power, we employ latent topic models to assign a topic to each word occurrence in the text corpus, and we learn topical word embeddings (TWE) based on both words and their topics. In this way, contextual word embeddings can be obtained flexibly to measure contextual word similarity. We can also build document representations that are more expressive than widely used document models such as latent topic models. In the experiments, we evaluate the TWE models on two tasks: contextual word similarity and text classification. The results show that our models outperform typical word embedding models, including a multi-prototype version, on contextual word similarity, and also exceed latent topic models and other representative document models on text classification. The source code of this paper can be obtained from https://github.com/largelymfs/topical_word_embeddings.
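The core idea in the abstract is that each word occurrence is paired with a topic assigned by a latent topic model, and an embedding is learned per word-topic pair, so distinct senses of a word get distinct vectors. The sketch below illustrates only that pairing step and the resulting sense separation; the topic assignments and the 2-d vectors are hand-made toy values standing in for LDA sampling and skip-gram training, and the `word#topic` token format is an illustrative convention, not the paper's exact implementation.

```python
# Toy sketch of the word-topic pairing behind topical word embeddings.
# Topic assignments and vectors below are invented for illustration.
from math import sqrt

# Toy corpus: each token carries a topic id, as a topic model sampler might assign.
corpus = [
    [("apple", 0), ("released", 0), ("iphone", 0)],   # topic 0: technology
    [("apple", 1), ("banana", 1), ("fruit", 1)],      # topic 1: food
]

def to_pseudo_words(doc):
    """Merge each word with its topic into one pseudo-word token."""
    return [f"{w}#{z}" for w, z in doc]

pseudo_docs = [to_pseudo_words(d) for d in corpus]

# Toy embeddings for the pseudo-words (in TWE these come from training
# a skip-gram model over the pseudo-word corpus).
vecs = {
    "apple#0": [0.9, 0.1],
    "iphone#0": [0.85, 0.2],
    "apple#1": [0.1, 0.9],
    "banana#1": [0.15, 0.95],
}

def cos(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

# The two senses of "apple" now have separate vectors, so similarity
# can be measured per sense rather than per surface form.
print(pseudo_docs[0])
print(cos(vecs["apple#0"], vecs["iphone#0"]))  # tech sense: similar to "iphone"
print(cos(vecs["apple#0"], vecs["banana#1"]))  # across senses: dissimilar
```

With a single vector per word, "apple" would sit between its tech and fruit neighbors; splitting it by topic is what lets the contextual-similarity evaluation in the abstract work.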
Pages: 2418 - 2424 (7 pages)
Related papers
50 items in total
  • [21] Eigenwords: Spectral Word Embeddings
    Dhillon, Paramveer S.
    Foster, Dean P.
    Ungar, Lyle H.
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2015, 16 : 3035 - 3078
  • [22] Revisiting Supervised Word Embeddings
    Vu, Dieu
    Truong, Khang
    Nguyen, Khanh
    Van Linh, Ngo
    Than, Khoat
    [J]. JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2022, 38 (02) : 413 - 427
  • [23] Ontology Matching with Word Embeddings
    Zhang, Yuanzhe
    Wang, Xuepeng
    Lai, Siwei
    He, Shizhu
    Liu, Kang
    Zhao, Jun
    Lv, Xueqiang
    [J]. CHINESE COMPUTATIONAL LINGUISTICS AND NATURAL LANGUAGE PROCESSING BASED ON NATURALLY ANNOTATED BIG DATA, CCL 2014, 2014, 8801 : 34 - 45
  • [24] Word Embeddings with Limited Memory
    Ling, Shaoshi
    Song, Yangqiu
    Roth, Dan
    [J]. PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2016), VOL 2, 2016, : 387 - 392
  • [25] Exploring Numeracy in Word Embeddings
    Naik, Aakanksha
    Ravichander, Abhilasha
    Rose, Carolyn
    Hovy, Eduard
    [J]. 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 3374 - 3380
  • [26] Word Embeddings for Comment Coherence
    Cimasa, Alfonso
    Corazza, Anna
    Coviello, Carmen
    Scanniello, Giuseppe
    [J]. 2019 45TH EUROMICRO CONFERENCE ON SOFTWARE ENGINEERING AND ADVANCED APPLICATIONS (SEAA 2019), 2019, : 244 - 251
  • [27] Chinese Word Embeddings with Subwords
    Yang, Gang
    Xu, Hongzhe
    Li, Wen
    [J]. 2018 INTERNATIONAL CONFERENCE ON ALGORITHMS, COMPUTING AND ARTIFICIAL INTELLIGENCE (ACAI 2018), 2018,
  • [28] Word Embeddings as Statistical Estimators
    Dey, Neil
    Singer, Matthew
    Williams, Jonathan P.
    Sengupta, Srijan
    [J]. SANKHYA-SERIES B-APPLIED AND INTERDISCIPLINARY STATISTICS, 2024,
  • [29] Word Embeddings for the Polish Language
    Rogalski, Marek
    Szczepaniak, Piotr S.
    [J]. ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, ICAISC 2016, 2016, 9692 : 126 - 135
  • [30] Word Embeddings Evaluation and Combination
    Ghannay, Sahar
    Favre, Benoit
    Esteve, Yannick
    Camelin, Nathalie
    [J]. LREC 2016 - TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2016, : 300 - 305