Enhancing Domain Word Embedding via Latent Semantic Imputation

Cited by: 9
Authors
Yao, Shibo [1]
Yu, Dantong [1]
Xiao, Keli [2]
Affiliations
[1] New Jersey Inst Technol, Newark, NJ 07102 USA
[2] SUNY Stony Brook, Stony Brook, NY 11794 USA
Keywords
representation learning; graph; manifold learning; spectral methods; dimensionality reduction
DOI
10.1145/3292500.3330926
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
We present a novel method named Latent Semantic Imputation (LSI) to transfer external knowledge into semantic space for enhancing word embedding. The method integrates graph theory to extract the latent manifold structure of the entities in the affinity space and leverages non-negative least squares with standard simplex constraints and power iteration method to derive spectral embeddings. It provides an effective and efficient approach to combining entity representations defined in different Euclidean spaces. Specifically, our approach generates and imputes reliable embedding vectors for low-frequency words in the semantic space and benefits downstream language tasks that depend on word embedding. We conduct comprehensive experiments on a carefully designed classification problem and language modeling and demonstrate the superiority of the enhanced embedding via LSI over several well-known benchmark embeddings. We also confirm the consistency of the results under different parameter settings of our method.
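The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the plain k-nearest-neighbor graph, the projected-gradient solver for the simplex-constrained least squares, and all function and parameter names (`project_to_simplex`, `simplex_weights`, `impute`, `k`, `steps`, `iters`) are assumptions made for illustration. Each entity is reconstructed from its neighbors in the affinity space with weights that are non-negative and sum to one, and the missing semantic vectors are then filled in by repeatedly multiplying with the weight matrix while the known vectors stay fixed, in the spirit of power iteration.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def simplex_weights(x, neighbors, steps=300):
    """Approximately minimize ||x - w @ neighbors||^2 over the simplex.

    Projected gradient descent is an assumption of this sketch; any
    simplex-constrained least-squares solver could be used instead.
    """
    k = neighbors.shape[0]
    w = np.full(k, 1.0 / k)
    gram = neighbors @ neighbors.T
    lr = 1.0 / (np.linalg.norm(gram, 2) + 1e-12)  # 1 / Lipschitz constant
    for _ in range(steps):
        grad = gram @ w - neighbors @ x
        w = project_to_simplex(w - lr * grad)
    return w


def impute(affinity, known_idx, known_emb, k=2, iters=200):
    """Impute semantic vectors for entities that only have affinity vectors.

    affinity  : (n, d_a) vectors for all n entities in the affinity space
    known_idx : indices of entities that already have a semantic embedding
    known_emb : (len(known_idx), d_s) semantic embeddings for those entities
    """
    n = affinity.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        dist = np.linalg.norm(affinity - affinity[i], axis=1)
        dist[i] = np.inf  # exclude the entity itself
        nbrs = np.argsort(dist)[:k]
        W[i, nbrs] = simplex_weights(affinity[i], affinity[nbrs])

    known = np.zeros(n, dtype=bool)
    known[known_idx] = True
    Y = np.zeros((n, known_emb.shape[1]))
    Y[known] = known_emb
    for _ in range(iters):
        Y = W @ Y             # propagate along the graph
        Y[known] = known_emb  # clamp the known embeddings
    return Y
```

Because every row of the weight matrix is a convex combination, each imputed vector stays within the range spanned by the known embeddings and the zero initialization. The iteration only yields meaningful values for entities that are connected, directly or through other entities, to at least one entity with a known embedding.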
Pages: 557-565
Page count: 9