Research on Word Vector Training Method Based on Improved Skip-Gram Algorithm

Cited by: 4
Authors
Tang, Yachun [1]
Affiliations
[1] Hunan Univ Sci & Engn, Coll Informat Engn, Yongzhou 425199, Peoples R China
Funding
Natural Science Foundation of Hunan Province
Keywords
CLASSIFICATION
DOI
10.1155/2022/4414207
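Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification codes
0808; 0809
Abstract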
An effective word vector training method yields semantically rich word vectors and better results on the same downstream task. To address the shortcomings of the traditional skip-gram model in encoding and modeling context words, this study proposes an improved word vector training method based on the skip-gram algorithm. Building on an analysis of the existing skip-gram model, the method introduces the distributional hypothesis: the distribution of each word over its contexts is taken as the representation of the word, the word is placed in a semantic space, and it is then modeled using word smoothing and that semantic space. During training, stochastic gradient descent is used to solve for the vector representation of each word and each Chinese character. In the experiments, the proposed method is compared with skip-gram, CWE+P, and SEING on a word sense similarity task and a text classification task. The experimental results show that the proposed method has significant advantages in the Chinese word segmentation task, with a performance gain of about 30%. The method provides a reference for further study of word vectors and text mining.
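For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal illustration of the traditional skip-gram model trained with stochastic gradient descent over a full softmax, not the improved method proposed in the paper. The function names (`skipgram_pairs`, `train_skipgram`), the hyperparameters, and the toy corpus are all assumptions made for the example.

```python
import math
import random


def skipgram_pairs(tokens, window=2):
    """Return (center, context) index pairs within a symmetric window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def train_skipgram(tokens, vocab_size, dim=8, window=1, lr=0.1, epochs=50, seed=0):
    """Train baseline skip-gram vectors with plain SGD and a full softmax.

    Hypothetical helper for illustration; returns the input vectors and
    the average loss per epoch.
    """
    rng = random.Random(seed)
    # Input (word) vectors start small and random; output vectors start at zero.
    w_in = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(vocab_size)]
    w_out = [[0.0] * dim for _ in range(vocab_size)]
    pairs = skipgram_pairs(tokens, window)
    losses = []
    for _ in range(epochs):
        rng.shuffle(pairs)
        total = 0.0
        for center, context in pairs:
            v = w_in[center]
            # Softmax over the whole vocabulary (numerically stabilised).
            scores = [sum(a * b for a, b in zip(v, u)) for u in w_out]
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]
            z = sum(exps)
            probs = [e / z for e in exps]
            total += -math.log(probs[context])
            # Gradient of the cross-entropy loss: (p_w - 1[w == context]).
            grad_v = [0.0] * dim
            for w in range(vocab_size):
                err = probs[w] - (1.0 if w == context else 0.0)
                u = w_out[w]
                for k in range(dim):
                    grad_v[k] += err * u[k]   # uses u before its update
                    u[k] -= lr * err * v[k]   # SGD step on the output vector
            for k in range(dim):
                v[k] -= lr * grad_v[k]        # SGD step on the input vector
        losses.append(total / len(pairs))
    return w_in, losses


# Toy corpus of word indices: the sequence 0, 1, 2, 3 repeated ten times.
corpus = [0, 1, 2, 3] * 10
vectors, losses = train_skipgram(corpus, vocab_size=4)
```

The full softmax is used here only because the toy vocabulary is tiny; practical skip-gram implementations usually replace it with negative sampling or hierarchical softmax.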
Pages: 8
Related papers
50 in total
  • [31] An improved incremental Training algorithm of Support Vector Machines
    Qin, Liang
    Yin, Hongwei
    Shi, Xianjun
    Xiao, Zhicai
    ADVANCED MEASUREMENT AND TEST, PTS 1-3, 2011, 301-303 : 677 - +
  • [32] Research on Missile Avoidance Decision Training Based on Improved DDPG Algorithm
    Fan Xin-lei
    Zou Jie
    Wang Peng-fei
    Liu Kai
    SEVENTH SYMPOSIUM ON NOVEL PHOTOELECTRONIC DETECTION TECHNOLOGY AND APPLICATIONS, 2021, 11763
  • [33] Research on support vector machine optimization based on improved quantum genetic algorithm
    Wang, Fei
    Xie, Kunlun
    Han, Lin
    Han, Menghui
    Wang, Zeshi
    QUANTUM INFORMATION PROCESSING, 2023, 22 (10)
  • [34] Research on support vector machine optimization based on improved quantum genetic algorithm
    Fei Wang
    Kunlun Xie
    Lin Han
    Menghui Han
    Zeshi Wang
    Quantum Information Processing, 22
  • [35] An improved compound gradient vector based neural network on-line training algorithm
    Chen, ZP
    Dong, C
    Zhou, QQ
    Zhang, SJ
    DEVELOPMENTS IN APPLIED ARTIFICIAL INTELLIGENCE, 2003, 2718 : 316 - 325
  • [36] An Improved Training Algorithm of Support Vector Machines Based on Three Data Points Iteration
    Li Cunhe
    Liu Kangwei
    Zhu Lina
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND INFORMATION TECHNOLOGY, 2008, : 695 - 699
  • [37] Research on flame detection method based on improved SSD algorithm
    Zhan, Huawei
    Pei, Xinyu
    Zhang, Tianhao
    Zhang, Linqing
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 45 (04) : 6501 - 6512
  • [38] Research on Resource Scheduling Method Based on Improved Hungary Algorithm
    Li, Tingpeng
    Li, Yue
    Qian, Yanling
    Li, Bin
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON LOGISTICS, ENGINEERING, MANAGEMENT AND COMPUTER SCIENCE (LEMCS 2015), 2015, 117 : 167 - 170
  • [39] Research on a Method of Fault Signal Extraction Based on Improved Algorithm
    Wang, Ying
    Li, Yourong
    Zhu, Xiaoqin
    Lin, Pan
    Luo, Yuesheng
    ADVANCED MATERIALS AND PROCESS TECHNOLOGY, PTS 1-3, 2012, 217-219 : 2692 - 2696
  • [40] A Malware Detection Method Based on Improved Fireworks Algorithm and Support Vector Machine
    Dong, Dawei
    Ye, Zhiwei
    Su, Jun
    Xie, Shiwei
    Cao, Yu
    Kochan, Roman
    15TH INTERNATIONAL CONFERENCE ON ADVANCED TRENDS IN RADIOELECTRONICS, TELECOMMUNICATIONS AND COMPUTER ENGINEERING (TCSET - 2020), 2020, : 846 - 851