Analysis of The Characteristics of Similar Words Computed by Word Embeddings

Cited by: 0
Authors
Zhou, Shuhui [1 ]
Liu, Peihan [2 ]
Liu, Lizhen [1 ]
Song, Wei [1 ]
Cheng, Miaomiao [1 ]
Affiliations
[1] Capital Normal Univ, Informat & Engn Coll, Beijing 100048, Peoples R China
[2] Shanghai Ocean Univ, AIEN Coll, Shanghai 201306, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
word similarity; word2vec; classification;
DOI
10.1109/iceiec49280.2020.9152307
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Word2vec is a popular word embedding technique that has attracted considerable attention in the NLP field. However, word embeddings based on distributed representation are deficient in the semantics of distribution. This defect often surfaces when word similarity is used to find words similar to a seed word. This article analyzes such similar words in light of this deficiency. We propose a novel classification criterion that classifies similar words into 7 categories. Finally, we list future research directions, hoping to solve the problem of word confusion effectively.
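As an illustration of the seed-word lookup the abstract refers to, the following is a minimal sketch that ranks candidate words by cosine similarity to a seed word. It is not the authors' code: the function names `cosine` and `most_similar` and the two-dimensional vectors are hypothetical, invented for this example, whereas real word2vec vectors are learned from a corpus and typically have hundreds of dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(seed, embeddings, topn=3):
    """Return the topn words closest to `seed`, excluding the seed itself."""
    seed_vec = embeddings[seed]
    scored = [
        (word, cosine(seed_vec, vec))
        for word, vec in embeddings.items()
        if word != seed
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]

# Hypothetical toy vectors, chosen by hand for illustration only.
toy_embeddings = {
    "good": [1.0, 0.0],
    "great": [0.9, 0.1],
    "bad": [0.8, 0.3],
    "table": [0.0, 1.0],
}

for word, score in most_similar("good", toy_embeddings, topn=3):
    print(f"{word}\t{score:.3f}")
```

In this toy setup the antonym "bad" ranks close to the seed "good" purely because its hand-picked vector points in a similar direction; the ranking carries no information about what kind of relation holds between the seed and each neighbor.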
Pages: 327-330
Page count: 4