Improving Word Representation by Tuning Word2Vec Parameters with Deep Learning Model

Cited by: 0
Authors
Tezgider, Murat [1 ]
Yildiz, Beytullah [2 ]
Aydin, Galip [3 ]
Affiliations
[1] Hacettepe Univ, Ankara, Turkey
[2] TC Cumhurbaskanligi, Bilgi Teknol Baskanligi, Ankara, Turkey
[3] Firat Univ, Bilgisayar Muhendisligi Bolumu, Elazig, Turkey
Keywords
Deep learning; text processing; text analysis; word representation; Word2Vec;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep learning has become one of the most popular machine learning methods. Success in text processing, analysis, and classification has been significantly enhanced by deep learning, and this success depends heavily on the quality of the word representations. TF-IDF, FastText, GloVe, and Word2Vec are commonly used for word representation. In this work, we aimed to improve word representations by tuning Word2Vec parameters; the quality of the resulting representations was measured with a deep learning classification model. Three Word2Vec parameters were varied: minimum word count, vector size, and window size. We used 2.8 million Turkish texts containing 243 million words to create the word embeddings (word representations), and around 263 thousand documents spanning 15 classes for classification. We observed that correctly selected parameters increased word representation quality and thus classification accuracy.
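The tuning procedure the abstract describes — sweeping Word2Vec's minimum word count, vector size, and window size and scoring each setting with a downstream classifier — can be sketched as below. The parameter grids and the `evaluate` stub are illustrative assumptions, not the authors' actual settings; in practice `evaluate` would train a Word2Vec model with the given parameters (e.g. gensim's `Word2Vec(corpus, min_count=…, vector_size=…, window=…)`), embed the documents, and return the deep learning classifier's accuracy.

```python
from itertools import product

# Candidate values for the three Word2Vec parameters studied in the paper.
# These grids are illustrative assumptions, not the authors' reported values.
MIN_COUNTS = [1, 5, 10]
VECTOR_SIZES = [100, 200, 300]
WINDOW_SIZES = [3, 5, 8]

def evaluate(min_count, vector_size, window):
    """Stub: would train Word2Vec with these parameters, embed the corpus,
    train the deep learning classifier, and return its accuracy.
    Replaced here by a deterministic placeholder so the sweep is runnable."""
    return 1.0 / (1 + abs(vector_size - 200) + abs(window - 5) + min_count)

def grid_search():
    """Exhaustively score every parameter combination; return the best."""
    best_params, best_score = None, float("-inf")
    for mc, vs, ws in product(MIN_COUNTS, VECTOR_SIZES, WINDOW_SIZES):
        score = evaluate(mc, vs, ws)
        if score > best_score:
            best_params, best_score = (mc, vs, ws), score
    return best_params, best_score
```

With only three values per parameter the full grid is 27 runs, which is feasible even when each run retrains embeddings on a large corpus; a larger grid would call for random or staged search instead.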
Pages: 7