Improving Word Representation by Tuning Word2Vec Parameters with Deep Learning Model

Authors
Tezgider, Murat [1]
Yildiz, Beytullah [2]
Aydin, Galip [3]
Affiliations
[1] Hacettepe Univ, Ankara, Turkey
[2] TC Cumhurbaskanligi, Bilgi Teknol Baskanligi, Ankara, Turkey
[3] Firat Univ, Bilgisayar Muhendisligi Bolumu, Elazig, Turkey
Keywords
Deep learning; text processing; text analysis; word representation; Word2Vec
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep learning has become one of the most popular machine learning methods. It has significantly improved results in text processing, analysis and classification. The quality of the word representations contributes to this success. TF-IDF, fastText, GloVe and Word2Vec are commonly used to represent words. In this work, we aimed to improve word representations by tuning Word2Vec parameters. We measured the quality of the word representations with a deep learning classification model, varying three Word2Vec parameters: minimum word count, vector size and window size. We used 2.8 million Turkish texts containing 243 million words to create the word embeddings (word representations), and around 263 thousand documents from 15 classes for classification. We observed that correctly selected parameters increased the quality of the word representations and thus the classification accuracy.
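To make the three tuned parameters concrete, the following is a minimal sketch, not the authors' pipeline. It uses only the Python standard library and does not train any embeddings. It shows how a minimum word count prunes the vocabulary, how the window size changes the number of (center, context) training pairs, and how the vector size scales the embedding table. The toy corpus, the grid values, and the helper names `build_vocab`, `context_pairs` and `summarize` are all hypothetical; the sketch also assumes a fixed symmetric window applied after rare words are dropped.

```python
from collections import Counter
from itertools import product


def build_vocab(sentences, min_count):
    """Keep only words that occur at least `min_count` times in the corpus."""
    counts = Counter(word for sentence in sentences for word in sentence)
    return {word for word, n in counts.items() if n >= min_count}


def context_pairs(sentences, vocab, window):
    """Yield (center, context) pairs using a fixed symmetric window.

    Assumption: rare words are removed before the window is applied.
    """
    for sentence in sentences:
        kept = [w for w in sentence if w in vocab]
        for i, center in enumerate(kept):
            lo = max(0, i - window)
            hi = min(len(kept), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    yield center, kept[j]


def summarize(sentences, grid):
    """Return one summary row per parameter combination in the grid."""
    rows = []
    combos = product(grid["min_count"], grid["vector_size"], grid["window"])
    for min_count, vector_size, window in combos:
        vocab = build_vocab(sentences, min_count)
        n_pairs = sum(1 for _ in context_pairs(sentences, vocab, window))
        rows.append({
            "min_count": min_count,
            "vector_size": vector_size,
            "window": window,
            "vocab_size": len(vocab),
            "n_pairs": n_pairs,
            # One vector of length `vector_size` per vocabulary word.
            "n_weights": len(vocab) * vector_size,
        })
    return rows


# Hypothetical toy corpus; the paper used 2.8 million Turkish texts.
corpus = [
    ["deep", "learning", "improves", "text", "classification"],
    ["word", "vectors", "improve", "text", "classification"],
    ["deep", "learning", "uses", "word", "vectors"],
]

# Hypothetical grid values; the abstract does not list the values tested.
grid = {"min_count": [1, 2], "vector_size": [50, 100], "window": [1, 2]}

rows = summarize(corpus, grid)
for row in rows:
    print(row)
```

In a real experiment, each parameter combination would be used to train embeddings (for example with gensim's `Word2Vec`, whose `min_count`, `vector_size` and `window` arguments correspond to the three parameters above), and each resulting embedding would then be scored by the accuracy of a downstream classifier.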
Pages: 7