Improving Word Representation by Tuning Word2Vec Parameters with Deep Learning Model

Cited by: 0
Authors
Tezgider, Murat [1 ]
Yildiz, Beytullah [2 ]
Aydin, Galip [3 ]
Affiliations
[1] Hacettepe Univ, Ankara, Turkey
[2] TC Cumhurbaskanligi, Bilgi Teknol Baskanligi, Ankara, Turkey
[3] Firat Univ, Bilgisayar Muhendisligi Bolumu, Elazig, Turkey
Keywords
Deep learning; text processing; text analysis; word representation; Word2Vec;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Deep learning has become one of the most popular machine learning methods. Success in text processing, analysis, and classification has been significantly enhanced by the use of deep learning, and this success depends largely on the quality of the word representations. TF-IDF, FastText, GloVe, and Word2Vec are commonly used for word representation. In this work, we aimed to improve word representations by tuning Word2Vec parameters. The quality of the word representations was measured using a deep learning classification model, and the minimum word count, vector size, and window size parameters of Word2Vec were varied for the measurement. We used 2.8 million Turkish texts consisting of 243 million words to create the word embeddings (word representations) and around 263 thousand documents belonging to 15 different classes for classification. We observed that correctly selected parameters increased the quality of the word representations and thus the classification accuracy.
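The record contains no code, but the abstract names the three Word2Vec parameters that were tuned. Below is a minimal sketch, not taken from the paper, of how such a parameter sweep might look, assuming gensim's Word2Vec API (vector_size, window, min_count) and a scikit-learn logistic regression as a simple stand-in for the authors' deep learning classification model; the toy corpus and grid values are placeholders.

    # Sketch only: sweep the three Word2Vec parameters named in the abstract
    # (minimum word count, vector size, window size) and score each setting.
    # Assumes gensim >= 4.0 and scikit-learn; the classifier is a stand-in
    # for the paper's deep learning model.
    from gensim.models import Word2Vec
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    import numpy as np

    def doc_vector(tokens, wv):
        # Average the embeddings of in-vocabulary tokens into a document vector.
        vecs = [wv[t] for t in tokens if t in wv]
        return np.mean(vecs, axis=0) if vecs else np.zeros(wv.vector_size)

    def score_setting(corpus_tokens, labels, vector_size, window, min_count):
        # Train Word2Vec with one parameter setting, then measure how well the
        # resulting document vectors support classification.
        w2v = Word2Vec(sentences=corpus_tokens, vector_size=vector_size,
                       window=window, min_count=min_count, workers=4, epochs=5)
        X = np.vstack([doc_vector(doc, w2v.wv) for doc in corpus_tokens])
        y = np.array(labels)
        return cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=3).mean()

    # Illustrative grid over the three tuned parameters (values are examples only).
    corpus_tokens = [["örnek", "metin", "bir"], ["başka", "örnek", "metin"]] * 50
    labels = [0, 1] * 50
    for vs in (100, 200, 300):
        for win in (5, 10):
            for mc in (1, 5):
                acc = score_setting(corpus_tokens, labels, vs, win, mc)
                print(f"vector_size={vs} window={win} min_count={mc} accuracy={acc:.3f}")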
Pages: 7