Automatic Synonym Extraction Using Word2Vec and Spectral Clustering

Cited by: 0
Authors
Zhang, Li [1]
Li, Jun [1]
Wang, Chao [1]
Affiliations
[1] Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei 230027, Anhui, Peoples R China
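Keywords
synonym; extraction; word2vec; semantic similarity; spectral clustering
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract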
Synonym extraction is a fundamental research task that supports text mining and information retrieval. In this paper, we propose a method for extracting synonyms from text that combines word2vec with spectral clustering. First, a word2vec model is trained on a large-scale English Wikipedia corpus. Then, we extract keywords from a text and use the trained model to compute pairwise similarities among these keywords. Because word2vec maps terms into a semantic vector space, the similarity between two terms is given by the cosine similarity of their vectors. We construct a graph over these terms together with its adjacency matrix. Finally, spectral clustering is applied to group similar words. Experimental results show that this method achieves higher accuracy and recall than K-means.
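
The pipeline in the abstract (term vectors, cosine similarity, adjacency matrix, spectral clustering) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the hand-written 3-dimensional vectors stand in for trained word2vec embeddings, the word list is invented, negative similarities are clipped to zero, and only two clusters are produced by splitting on the sign of the second eigenvector of the normalized Laplacian. A general k-way variant would typically run k-means on the first k eigenvectors instead.

```python
import numpy as np

# Hypothetical toy "embeddings"; a real run would use vectors from a
# trained word2vec model instead of these hand-written values.
words = ["car", "automobile", "vehicle", "happy", "glad", "joyful"]
vectors = np.array([
    [0.90, 0.10, 0.00],
    [0.85, 0.15, 0.05],
    [0.80, 0.20, 0.10],
    [0.10, 0.90, 0.20],
    [0.05, 0.85, 0.25],
    [0.15, 0.80, 0.30],
])


def cosine_similarity_matrix(x):
    """Pairwise cosine similarity between the rows of x."""
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    return unit @ unit.T


def spectral_bipartition(adjacency):
    """Split a weighted graph into two clusters.

    Uses the sign of the eigenvector belonging to the second-smallest
    eigenvalue of the symmetric normalized Laplacian.
    """
    degree = adjacency.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    normalized = adjacency * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    laplacian = np.eye(len(adjacency)) - normalized
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(laplacian)
    second = eigvecs[:, 1]
    return (second > 0).astype(int)


similarity = cosine_similarity_matrix(vectors)

# Adjacency matrix of the term graph: non-negative weights, no self-loops.
adjacency = np.clip(similarity, 0.0, None)
np.fill_diagonal(adjacency, 0.0)

labels = spectral_bipartition(adjacency)
for word, label in zip(words, labels):
    print(word, label)
```

The cluster ids themselves are arbitrary, since the sign of an eigenvector is not fixed; only the grouping of words is meaningful.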
Pages: 5629-5632
Number of pages: 4