Enhancing Domain Word Embedding via Latent Semantic Imputation

Cited by: 9
Authors
Yao, Shibo [1]
Yu, Dantong [1]
Xiao, Keli [2]
Affiliations
[1] New Jersey Inst Technol, Newark, NJ 07102 USA
[2] SUNY Stony Brook, Stony Brook, NY 11794 USA
Keywords
representation learning; graph; manifold learning; spectral methods; dimensionality reduction
DOI
10.1145/3292500.3330926
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
We present a novel method named Latent Semantic Imputation (LSI) to transfer external knowledge into semantic space for enhancing word embedding. The method integrates graph theory to extract the latent manifold structure of the entities in the affinity space and leverages non-negative least squares with standard simplex constraints and power iteration method to derive spectral embeddings. It provides an effective and efficient approach to combining entity representations defined in different Euclidean spaces. Specifically, our approach generates and imputes reliable embedding vectors for low-frequency words in the semantic space and benefits downstream language tasks that depend on word embedding. We conduct comprehensive experiments on a carefully designed classification problem and language modeling and demonstrate the superiority of the enhanced embedding via LSI over several well-known benchmark embeddings. We also confirm the consistency of the results under different parameter settings of our method.
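The pipeline outlined in the abstract (neighbor weights constrained to the standard simplex, followed by power iteration) can be illustrated with a rough sketch. This is not the authors' implementation: the function names, the plain nearest-neighbor search, and the projected-gradient solver (swapped in for a dedicated non-negative least squares routine) are all assumptions made for illustration, and the sketch assumes every unknown entity is reachable from some known one through its neighbors.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - (css - 1.0) / k > 0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def simplex_weights(x, neighbors, steps=500):
    """Minimise ||x - w @ neighbors||^2 with w on the simplex.

    Projected gradient descent is an assumption of this sketch,
    standing in for a non-negative least squares solver.
    """
    k = neighbors.shape[0]
    w = np.full(k, 1.0 / k)
    gram = neighbors @ neighbors.T
    lr = 1.0 / (np.linalg.norm(gram, 2) + 1e-12)  # 1 / Lipschitz constant
    for _ in range(steps):
        grad = gram @ w - neighbors @ x
        w = project_to_simplex(w - lr * grad)
    return w

def impute(affinity, known_emb, known_idx, n_neighbors=2, iters=200):
    """Hypothetical helper: fill in embeddings for rows not in known_idx."""
    n = affinity.shape[0]
    known = np.zeros(n, dtype=bool)
    known[known_idx] = True
    W = np.zeros((n, n))
    for i in range(n):
        if known[i]:
            W[i, i] = 1.0  # known embeddings stay fixed
            continue
        dist = np.linalg.norm(affinity - affinity[i], axis=1)
        dist[i] = np.inf  # exclude the point itself
        nbrs = np.argsort(dist)[:n_neighbors]
        W[i, nbrs] = simplex_weights(affinity[i], affinity[nbrs])
    Y = np.zeros((n, known_emb.shape[1]))
    Y[known_idx] = known_emb
    for _ in range(iters):  # power iteration on the weight matrix
        Y = W @ Y
        Y[known_idx] = known_emb
    return Y

# Toy data (made up): three entities in a 1-D affinity space,
# the first two with known 2-D embeddings.
affinity = np.array([[0.0], [4.0], [1.0]])
known_emb = np.array([[0.0, 0.0], [4.0, 8.0]])
Y = impute(affinity, known_emb, known_idx=[0, 1])
```

In this toy case the third entity sits a quarter of the way between the two known entities in the affinity space, so its imputed embedding is the corresponding convex combination of their embeddings.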
Pages: 557 - 565
Page count: 9
Related papers
50 in total
  • [1] Enhancing Semantic Word Representations by Embedding Deep Word Relationships
    Nugaliyadde, Anupiya
    Wong, Kok Wai
    Sohel, Ferdous
    Xie, Hong
    PROCEEDINGS OF 2019 11TH INTERNATIONAL CONFERENCE ON COMPUTER AND AUTOMATION ENGINEERING (ICCAE 2019), 2019, : 82 - 87
  • [2] Evaluation of Word Embedding via Domain Keywords
    Fu, Qunchao
    Li, Zongyang
    Han, Xu
    Wang, Cong
    PROCEEDINGS OF 2018 5TH IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (CCIS), 2018, : 290 - 294
  • [3] Domain-specific meta-embedding with latent semantic structures
    Liu, Qian
    Lu, Jie
    Zhang, Guangquan
    Shen, Tao
    Zhang, Zhihan
    Huang, Heyan
    INFORMATION SCIENCES, 2021, 555 : 410 - 423
  • [4] SenSE: A Toolkit for Semantic Change Exploration via Word Embedding Alignment
    Gruppi, Mauricio
    Adali, Sibel
    Chen, Pin-Yu
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 13170 - 13172
  • [5] Evolutions of semantic consistency in research topic via contextualized word embedding
    Huang, Shengzhi
    Lu, Wei
    Cheng, Qikai
    Luo, Zhuoran
    Huang, Yong
    INFORMATION PROCESSING & MANAGEMENT, 2024, 61 (06)
  • [6] Lifelong Domain Word Embedding via Meta-Learning
    Xu, Hu
    Liu, Bing
    Shu, Lei
    Yu, Philip S.
    PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 4510 - 4516
  • [7] Enhancing Latent Semantic Analysis by Embedding Tagging Algorithm in Retrieving Malay Text Documents
    Abd Rahman, Nurazzah
    Soom, Afiqah Bazlla Md
    Ismail, Normaly Kamal
    ADVANCED TOPICS IN INTELLIGENT INFORMATION AND DATABASE SYSTEMS, 2017, 710 : 309 - 319
  • [8] Embedding Semantic Relations into Word Representations
    Bollegala, Danushka
    Maehara, Takanori
    Kawarabayashi, Ken-ichi
    PROCEEDINGS OF THE TWENTY-FOURTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI), 2015, : 1222 - 1228
  • [9] Word completion with latent semantic analysis
    Miller, Tristan
    Wolf, Elisabeth
    18TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 1, PROCEEDINGS, 2006, : 1252 - +
  • [10] Study on the Chinese Word Semantic Relation Classification with Word Embedding
    Shijia, E.
    Jia, Shengbin
    Xiang, Yang
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2017, 2018, 10619 : 849 - 855