KEYWORD EXTRACTION BASED ON WORD SYNONYMS USING WORD2VEC

Cited by: 2
Authors
Ogul, Iskender Ulgen [1]
Ozcan, Caner [2]
Hakdagli, Ozlem [3]
Affiliations
[1] Izmir Yuksek Teknol Enstitusu, Bilgisayar Muhendisligi, Izmir, Turkey
[2] Karabuk Univ, Bilgisayar Muhendisligi, Karabuk, Turkey
[3] Uludag Univ, Bilgisayar Muhendisligi, Bursa, Turkey
Keywords
Spark; Word2Vec; Word Embedding; Keyword Extraction; Text Mining
DOI
10.1109/siu.2019.8806496
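Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract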
The volume of data that individuals share online is growing exponentially. Machine learning and deep learning methods transform the raw information in this data into meaningful outputs. Supervised learning methods, which depend on the training set used to train the classification algorithms, are generally used for information extraction and classification. This work proposes a keyword extraction solution that makes text data easier to classify. The solution is based on the Word2Vec word embedding algorithm, which takes the semantic meaning of words into account, unlike common approaches based on word frequency. It works by calculating word weights, semantic relationships, and the final weights of the vectors. Naive Bayes and Decision Tree classifiers are trained on the extracted keywords, and the performance of the proposed method is demonstrated with a classification example.
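The abstract does not spell out the scoring formula, so the following is only a minimal sketch of the general idea of ranking words by semantic relatedness instead of frequency. It is not the authors' implementation. The toy 3-dimensional vectors stand in for embeddings that a trained Word2Vec model would supply, and the names `EMBEDDINGS`, `TOKENS`, `cosine`, and `extract_keywords` are hypothetical. Each word is scored by its mean cosine similarity to the other words in the document, which is one assumed way to weight words semantically.

```python
import math

# Toy 3-dimensional embeddings with hypothetical values. A trained Word2Vec
# model would supply real vectors of much higher dimension.
EMBEDDINGS = {
    "spark":   [0.9, 0.1, 0.0],
    "cluster": [0.8, 0.2, 0.1],
    "data":    [0.7, 0.3, 0.1],
    "banana":  [0.0, 0.1, 0.9],
}

# Example document tokens; "unknown" has no embedding and is skipped.
TOKENS = ["spark", "cluster", "data", "banana", "unknown"]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def extract_keywords(tokens, embeddings, top_k=2):
    """Rank distinct in-vocabulary tokens by mean similarity to the others.

    Assumed scoring rule for illustration: a word that is semantically
    close to many other words in the document gets a higher weight.
    """
    vocab = sorted({t for t in tokens if t in embeddings})
    scores = {}
    for word in vocab:
        others = [o for o in vocab if o != word]
        if others:
            total = sum(cosine(embeddings[word], embeddings[o]) for o in others)
            scores[word] = total / len(others)
        else:
            scores[word] = 0.0
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(vocab, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


keywords = extract_keywords(TOKENS, EMBEDDINGS)
```

In this toy data, "banana" points in a different direction from the other three words, so it receives the lowest score and is left out of the top two. The selected keywords could then serve as features for a classifier such as Naive Bayes or a decision tree.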
Pages: 4