KEYWORD EXTRACTION BASED ON WORD SYNONYMS USING WORD2VEC

Cited by: 2
Authors
Ogul, Iskender Ulgen [1]
Ozcan, Caner [2]
Hakdagli, Ozlem [3]
Affiliations
[1] Izmir Yuksek Teknol Enstitusu, Bilgisayar Muhendisligi, Izmir, Turkey
[2] Karabuk Univ, Bilgisayar Muhendisligi, Karabuk, Turkey
[3] Uludag Univ, Bilgisayar Muhendisligi, Bursa, Turkey
Keywords
Spark; Word2Vec; Word Embedding; Keyword Extraction; Text Mining;
DOI
10.1109/siu.2019.8806496
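Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract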
The volume of data that individuals generate online is growing exponentially. The raw information in this data is transformed into meaningful output using machine learning and deep learning methods. Supervised learning methods are generally used for information extraction and classification, and they depend on the training set on which the classification algorithms are trained. This work proposes a keyword extraction solution that makes text data easier to classify. The solution is based on the Word2Vec algorithm, which takes the semantic meaning of words into account, unlike common approaches that are based on word frequency. The approach uses the word embedding algorithm "Word2Vec" and works by calculating the word weights, the semantic relationships, and the final weights of the vectors. Naive Bayes and Decision Tree classifiers are trained on the extracted keywords, and the performance of the proposed method is demonstrated with a classification example.
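The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea of combining word frequency with embedding-based semantic similarity. It is not the authors' method. The vectors in `TOY_VECTORS` are hand-made placeholders standing in for trained Word2Vec embeddings, and `extract_keywords` and its frequency-times-similarity score are hypothetical choices made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy embeddings (assumption): a real pipeline would use
# vectors produced by a trained Word2Vec model instead.
TOY_VECTORS = {
    "spark": [0.9, 0.1, 0.0],
    "cluster": [0.8, 0.2, 0.1],
    "data": [0.7, 0.3, 0.1],
    "weather": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def extract_keywords(tokens, vectors, top_k=2):
    """Rank words by frequency times mean similarity to the other words.

    The scoring rule is an illustrative assumption, not the paper's formula.
    """
    counts = Counter(t for t in tokens if t in vectors)
    vocab = list(counts)
    scores = {}
    for word in vocab:
        others = [o for o in vocab if o != word]
        if others:
            sim = sum(cosine(vectors[word], vectors[o]) for o in others) / len(others)
        else:
            sim = 1.0  # a single-word document has nothing to compare against
        scores[word] = counts[word] * sim
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


tokens = "spark cluster data spark data weather".split()
keywords = extract_keywords(tokens, TOY_VECTORS)
print(keywords)
```

In this toy document, words that are both frequent and semantically close to the rest of the text rank highest, while an off-topic word such as "weather" scores low even though it appears in the text. The selected keywords could then serve as features for a classifier such as Naive Bayes or a decision tree.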
Pages: 4
Related papers
50 items in total
  • [1] Research on Keyword Extraction Based on Word2Vec Weighted TextRank
    Wen, Yujun
    Yuan, Hui
    Zhang, Pengzhou
    [J]. 2016 2ND IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2016, : 2109 - 2113
  • [2] Keywords Extraction Based on Word2Vec and TextRank
    Zhang, Yong
    Chen, Fen
    Zhang, Wufeng
    Zuo, Haoyang
    Yu, Fangyuan
    [J]. 2020 3RD INTERNATIONAL CONFERENCE ON BIG DATA AND EDUCATION (ICBDE 2020), 2020, : 37 - 42
  • [3] Stability of Word Embeddings Using Word2Vec
    Chugh, Mansi
    Whigham, Peter A.
    Dick, Grant
    [J]. AI 2018: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, 11320 : 812 - 818
  • [4] Arabic Text Keywords Extraction using Word2vec
    Suleiman, Dima
    Awajan, Arafat A.
    Al Etaiwi, Wael
    [J]. 2019 2ND INTERNATIONAL CONFERENCE ON NEW TRENDS IN COMPUTING SCIENCES (ICTCS), 2019, : 251 - 257
  • [5] Word Semantic Similarity Calculation Based on Word2vec
    Jin, Xiaolin
    Zhang, Shuwu
    Liu, Jie
    [J]. 2018 INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND INFORMATION SCIENCES (ICCAIS), 2018, : 12 - 16
  • [6] Study on Tibetan Word Vector based on Word2vec
    Yang, Ning
    Li, Guanyu
    Ding, Hailan
    Gong, Chunwei
    [J]. 2018 INTERNATIONAL SYMPOSIUM ON POWER ELECTRONICS AND CONTROL ENGINEERING (ISPECE 2018), 2019, 1187
  • [7] Word Clustering based on Word2vec and Semantic Similarity
    Luo Jie
    Wang Qinglin
    Li Yuan
    [J]. 2014 33RD CHINESE CONTROL CONFERENCE (CCC), 2014, : 517 - 521
  • [8] SynoExtractor: A Novel Pipeline for Arabic Synonym Extraction Using Word2Vec Word Embeddings
    Al-Matham, Rawan N.
    Al-Khalifa, Hend S.
    [J]. COMPLEXITY, 2021, 2021
  • [9] Automatic Synonym Extraction Using Word2Vec and Spectral Clustering
    Zhang, Li
    Li, Jun
    Wang, Chao
    [J]. PROCEEDINGS OF THE 36TH CHINESE CONTROL CONFERENCE (CCC 2017), 2017, : 5629 - 5632
  • [10] An Effective SNS event extraction model based on Word2vec
    Jang, Beakcheol
    [J]. BASIC & CLINICAL PHARMACOLOGY & TOXICOLOGY, 2018, 123 : 45 - 45