GNEG: Graph-Based Negative Sampling for word2vec

Authors
Zhang, Zheng [1, 2]
Zweigenbaum, Pierre [1]
Affiliations
[1] Univ Paris Saclay, CNRS, LIMSI, Orsay, France
[2] Univ Paris Saclay, CNRS, Univ Paris Sud, LRI, Orsay, France
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Negative sampling is an important component of word2vec for distributed word representation learning. We hypothesize that taking into account global, corpus-level information and generating a different noise distribution for each target word satisfies the requirements of negative examples for each training word better than the original frequency-based distribution. To this end, we pre-compute word co-occurrence statistics from the corpus and apply network algorithms such as random walk to them. We test this hypothesis through a set of experiments whose results show that our approach improves performance on the word analogy task by about 5% and on word similarity tasks by about 1% compared to the skip-gram negative sampling baseline.
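
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the window size, the number of random-walk steps, the uniform fallback for isolated words, and all function names (`cooccurrence_counts`, `transition_matrix`, `random_walk`, `noise_distributions`, `sample_negatives`) are assumptions made here for illustration. The sketch counts co-occurrences, row-normalizes them into a transition matrix, takes a few random-walk steps, and uses each resulting row as the noise distribution for that target word.

```python
import random
from collections import defaultdict


def cooccurrence_counts(sentences, window=2):
    """Count symmetric word co-occurrences within a fixed window (assumed size)."""
    counts = defaultdict(lambda: defaultdict(float))
    for tokens in sentences:
        for i, target in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[target][tokens[j]] += 1.0
    return counts


def transition_matrix(counts, vocab):
    """Row-normalize counts into one-step random-walk probabilities."""
    matrix = []
    for word in vocab:
        row = [counts[word].get(context, 0.0) for context in vocab]
        total = sum(row)
        # Assumption: a word with no co-occurrences falls back to a uniform row.
        matrix.append([x / total if total else 1.0 / len(vocab) for x in row])
    return matrix


def random_walk(matrix, steps):
    """Return the `steps`-step transition matrix by repeated multiplication."""
    n = len(matrix)
    result = matrix
    for _ in range(steps - 1):
        result = [
            [sum(result[i][k] * matrix[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
    return result


def noise_distributions(sentences, window=2, steps=2):
    """Build one noise distribution per target word from corpus-level statistics."""
    vocab = sorted({word for tokens in sentences for word in tokens})
    counts = cooccurrence_counts(sentences, window)
    walked = random_walk(transition_matrix(counts, vocab), steps)
    return vocab, {word: walked[i] for i, word in enumerate(vocab)}


def sample_negatives(vocab, noise, target, k, rng=random):
    """Draw k negative examples for `target` from its own noise distribution."""
    return rng.choices(vocab, weights=noise[target], k=k)


sentences = [["the", "cat", "sat"], ["the", "dog", "sat"]]
vocab, noise = noise_distributions(sentences, window=2, steps=2)
negatives = sample_negatives(vocab, noise, "cat", k=5, rng=random.Random(0))
```

In this sketch each target word draws negatives from its own row of the walked matrix, in contrast to the single frequency-based distribution shared by all words in the skip-gram negative sampling baseline.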
Pages: 566-571
Page count: 6