Synonym Discovery with Etymology-based Word Embeddings

Authors
Yoon, Seunghyun [1]
Estrada, Pablo [1]
Jung, Kyomin [1, 2]
Affiliations
[1] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul, South Korea
[2] Seoul Natl Univ, Automat & Syst Res Inst, Seoul, South Korea
Funding
National Research Foundation, Singapore;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a novel approach to learning word embeddings based on an extended version of the distributional hypothesis. Our model derives word embedding vectors from the etymological composition of words rather than from the contexts in which they appear. Its strength is that it does not require a large text corpus; instead, it requires reliable access to the etymological roots of words, which makes it especially well suited to languages with logographic writing systems. The model consists of three steps: (1) building an etymological graph, which is a bipartite network of words and etymological roots; (2) obtaining the biadjacency matrix of the etymological graph and reducing its dimensionality; and (3) using the columns or rows of the resulting matrices as embedding vectors. We test our model on the Chinese and Sino-Korean vocabularies. Our graphs are built from a set of 117,000 Chinese words and a set of 135,000 Sino-Korean words. In both cases, we show that our model performs well on the task of synonym discovery.
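The three steps in the abstract can be sketched on a toy vocabulary. This is only an illustration under stated assumptions, not the authors' implementation: the six words, the choice of individual characters as "roots", the use of truncated SVD for the dimensionality reduction, and the helper names `build_biadjacency`, `embed`, and `nearest` are all hypothetical. The abstract does not say which reduction method is used.

```python
import numpy as np

# Toy data (an assumption for illustration): each word maps to its roots,
# here simply the characters it is written with.
WORD_ROOTS = {
    "學校": ["學", "校"],
    "學生": ["學", "生"],
    "學院": ["學", "院"],
    "醫院": ["醫", "院"],
    "醫生": ["醫", "生"],
    "先生": ["先", "生"],
}


def build_biadjacency(word_roots):
    """Steps 1-2a: bipartite word/root graph as a words x roots 0/1 matrix."""
    words = list(word_roots)
    roots = sorted({r for rs in word_roots.values() for r in rs})
    col = {r: j for j, r in enumerate(roots)}
    B = np.zeros((len(words), len(roots)))
    for i, w in enumerate(words):
        for r in word_roots[w]:
            B[i, col[r]] = 1.0
    return words, roots, B


def embed(B, k):
    """Step 2b-3: reduce with truncated SVD (assumed) and return one row per word."""
    U, s, _ = np.linalg.svd(B, full_matrices=False)
    k = min(k, len(s))
    return U[:, :k] * s[:k]


def nearest(word, words, E, top=3):
    """Rank the other words by cosine similarity to `word`."""
    i = words.index(word)
    norms = np.linalg.norm(E, axis=1)
    norms[norms == 0] = 1.0  # guard against division by zero
    sims = (E @ E[i]) / (norms * norms[i])
    order = [j for j in np.argsort(-sims) if j != i][:top]
    return [(words[j], float(sims[j])) for j in order]


if __name__ == "__main__":
    words, roots, B = build_biadjacency(WORD_ROOTS)
    E = embed(B, k=3)
    print(nearest("學校", words, E))
```

Words that share roots end up with similar rows in the reduced matrix, so candidate synonyms can be read off as nearest neighbours. At the scale reported in the abstract, a sparse matrix and a sparse SVD routine would be the more practical choice.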
Pages: 1336-1341
Page count: 6