Research on Word Vector Training Method Based on Improved Skip-Gram Algorithm

Cited by: 4
Authors
Tang, Yachun [1 ]
Affiliation
[1] Hunan Univ Sci & Engn, Coll Informat Engn, Yongzhou 425199, Peoples R China
Funding
Natural Science Foundation of Hunan Province;
Keywords
CLASSIFICATION;
DOI
10.1155/2022/4414207
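Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code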
0808 ; 0809 ;
Abstract
An effective word vector training method yields semantically rich word vectors and better results on the same task. To address the shortcomings of the traditional skip-gram model in encoding and modeling context words, this study proposes an improved word vector training method based on the skip-gram algorithm. Building on an analysis of the existing skip-gram model, the method introduces the distributional hypothesis: the distribution of each word over its contexts is taken as the representation of that word, the word is placed in a semantic space, and it is then modeled with the help of word smoothing and that semantic space. During training, stochastic gradient descent is used to solve for the vector representation of each word and each Chinese character. In the experiments, the proposed training method is compared with skip-gram, CWE+P, and SEING on a word sense similarity task and a text classification task. The experimental results showed that the proposed method had significant advantages in the Chinese word segmentation task, with a performance gain of about 30%. The method proposed in this study provides a reference for the in-depth study of word vectors and text mining.
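For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal pure-Python illustration of standard skip-gram training with negative sampling and stochastic gradient descent, not the improved method proposed in the paper: it learns word vectors only (no Chinese character vectors), draws negatives uniformly rather than from a frequency-based distribution, and all function names and hyperparameter values here are assumptions chosen for the example.

```python
import math
import random


def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs within a symmetric context window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def sigmoid(x):
    # Clamp the input so math.exp cannot overflow.
    x = max(-30.0, min(30.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def train_skipgram(corpus, dim=8, window=2, negatives=2,
                   lr=0.05, epochs=50, seed=0):
    """Train baseline skip-gram vectors with SGD (illustrative sketch only)."""
    rng = random.Random(seed)
    vocab = sorted(set(corpus))
    index = {w: i for i, w in enumerate(vocab)}
    ids = [index[w] for w in corpus]

    # Input (word) vectors start small and random; output vectors start at zero.
    w_in = [[rng.uniform(-0.5, 0.5) / dim for _ in range(dim)] for _ in vocab]
    w_out = [[0.0] * dim for _ in vocab]

    pairs = skipgram_pairs(ids, window)
    for _ in range(epochs):
        rng.shuffle(pairs)
        for center, context in pairs:
            # One observed context word (label 1) plus uniformly drawn
            # negative samples (label 0). Uniform sampling is a simplification.
            targets = [(context, 1.0)]
            targets += [(rng.randrange(len(vocab)), 0.0)
                        for _ in range(negatives)]
            v = w_in[center]
            grad_v = [0.0] * dim
            for t, label in targets:
                u = w_out[t]
                score = sigmoid(sum(a * b for a, b in zip(v, u)))
                g = lr * (label - score)
                for k in range(dim):
                    grad_v[k] += g * u[k]  # accumulate using the old u
                    u[k] += g * v[k]       # SGD step on the output vector
            for k in range(dim):
                v[k] += grad_v[k]          # SGD step on the word vector
    return vocab, w_in


if __name__ == "__main__":
    words = "the cat sat on the mat".split()
    vocab, vectors = train_skipgram(words, dim=4, epochs=20)
    for word, vec in zip(vocab, vectors):
        print(word, [round(x, 3) for x in vec])
```

Each (center, context) pair triggers one stochastic gradient step, which is the training regime the abstract refers to; the paper's changes to how context words are represented would replace the plain dot-product scoring shown here.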
Pages: 8