Modeling multi-prototype Chinese word representation learning for word similarity

Cited by: 2
Authors
Yin, Fulian [1]
Wang, Yanyan [1]
Liu, Jianbo [1]
Tosato, Marco [2]
Affiliations
[1] Commun Univ China, Inst Informat & Commun, Beijing 100024, Peoples R China
[2] York Univ, Lab Ind & Appl Math, Toronto, ON M3J 1P3, Canada
Funding
National Natural Science Foundation of China;
Keywords
Chinese word representation; Multi-prototype; Synonym knowledge base; Word semantic disambiguation; ONTOLOGY-BASED METHODS; EMBEDDINGS; SENTIMENT;
DOI
10.1007/s40747-021-00482-y
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The word similarity task measures the similarity of any pair of words and is a basic technology of natural language processing (NLP). Existing methods are based on word embeddings, which fail to capture polysemy and are strongly affected by the quality of the corpus. In this paper, we propose a multi-prototype Chinese word representation model (MP-CWR) for word similarity based on a synonym knowledge base; it comprises a knowledge representation module and a word similarity module. For the first module, we propose a dual attention mechanism that combines semantic information to jointly learn word knowledge representations. The MP-CWR model uses synonyms as prior knowledge to supplement the relationships between words, which helps address the difficulty of expressing semantics when data are insufficient. For the word similarity module, we propose a multi-prototype representation for each word, then calculate and fuse the conceptual similarities of the two words to obtain the final result. Finally, we verify the effectiveness of our model against baseline models on three public data sets. The experiments also demonstrate the stability and scalability of the MP-CWR model under different corpora.
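The idea of scoring two words from several prototype vectors each can be illustrated with a minimal sketch. The abstract does not specify how MP-CWR fuses the per-concept similarities, so the two fusion rules below (taking the maximum or the mean of all pairwise cosine similarities) are assumptions chosen for illustration only, and the function names `cosine`, `max_sim`, and `avg_sim` are hypothetical.

```python
import math
from itertools import product


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def max_sim(protos_a, protos_b):
    """Assumed fusion rule: similarity of the closest pair of prototypes."""
    return max(cosine(u, v) for u, v in product(protos_a, protos_b))


def avg_sim(protos_a, protos_b):
    """Assumed fusion rule: mean similarity over all prototype pairs."""
    sims = [cosine(u, v) for u, v in product(protos_a, protos_b)]
    return sum(sims) / len(sims)


# Toy data (made up): word A has two senses, word B has one.
word_a = [[1.0, 0.0], [0.0, 1.0]]
word_b = [[1.0, 0.0]]

print(max_sim(word_a, word_b))
print(avg_sim(word_a, word_b))
```

With a single vector per word, the two senses of word A would be averaged into one point; keeping them separate lets the score reflect the sense that actually matches word B.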
Pages: 2977 - 2990
Page count: 14
Related papers
50 in total
  • [21] Learning Chinese word representation better by cascade morphological n-gram
    Xiong, Zongyang
    Qin, Ke
    Yang, Haobo
    Luo, Guangchun
    NEURAL COMPUTING & APPLICATIONS, 2021, 33 (08): : 3757 - 3768
  • [22] An approach based on Tongyici Cilin and word similarity for Chinese word sense induction
    Sun, Rui
    Jin, Peng
    Yang, Xia
    ICIC Express Letters, 2013, 7 (06): : 1767 - 1772
  • [23] Adversarial Multi-Criteria Learning for Chinese Word Segmentation
    Chen, Xinchi
    Shi, Zhan
    Qiu, Xipeng
    Huang, Xuanjing
    PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 1, 2017, : 1193 - 1203
  • [24] Convolution-deconvolution word embedding: An end-to-end multi-prototype fusion embedding method for natural language processing
    Shuang, Kai
    Zhang, Zhixuan
    Loo, Jonathan
    Su, Sen
    INFORMATION FUSION, 2020, 53 : 112 - 122
  • [25] Improved Word Representation Learning with Sememes
    Niu, Yilin
    Xie, Ruobing
    Liu, Zhiyuan
    Sun, Maosong
    PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 1, 2017, : 2049 - 2058
  • [26] Learning Multi-Modal Word Representation Grounded in Visual Context
    Zablocki, Eloi
    Piwowarski, Benjamin
    Soulier, Laure
    Gallinari, Patrick
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 5626 - 5633
  • [27] Learning distributed word representation with multi-contextual mixed embedding
    Li, Jianqiang
    Li, Jing
    Fu, Xianghua
    Masud, M. A.
    Huang, Joshua Zhexue
    KNOWLEDGE-BASED SYSTEMS, 2016, 106 : 220 - 230
  • [28] Bridging Text and Knowledge by Learning Multi-Prototype Entity Mention Embedding
    Cao, Yixin
    Huang, Lifu
    Ji, Heng
    Chen, Xu
    Li, Juanzi
    PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 1, 2017, : 1623 - 1633
  • [29] Improving Chinese Word Representation with Conceptual Semantics
    Wei, Tingxin
    Qu, Weiguang
    Zhou, Junsheng
    Long, Yunfei
    Gu, Yanhui
    Xia, Zhentao
    CMC-COMPUTERS MATERIALS & CONTINUA, 2020, 64 (03): : 1897 - 1913
  • [30] Phonological similarity in multi-word units
    Gries, Stefan Th.
    COGNITIVE LINGUISTICS, 2011, 22 (03) : 491 - 510