Creation of Necessary Technical and Expert-Analytical Conditions for Development of the Information System of Evaluating Open Text Information Sources' Influence on Society

Cited by: 0
Authors
Mussabayev, Rustam [1]
Kassymzhanov, Bek [1]
Mukashev, Aidos [1]
Ibrayeva, Viktoriya [1]
Merkebayev, Azat [1]
Affiliations
[1] MES RK IIVT, Lab Anal & Modeling Informat Proc, Inst Informat & Comp Technol Comm Sci, Alma Ata, Kazakhstan
Keywords
distributional models; text preprocessing; artificial intelligence; machine learning; Word2Vec and GloVe
DOI
Not available
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract
In this paper, we trained Word2Vec and GloVe distributional models on preprocessed text, using three variants of text preprocessing. With the trained Word2Vec model, we obtained vector representations for a test sample of 30 news items that had been divided into clusters. All variants of computing a text's vector representation as a weighted average were considered, and two-stage clustering was carried out. After training a Doc2Vec model on normalized documents, a vector representation was obtained for each document. For the test, we selected news items that cover the same event but come from different sources. A 2-dimensional "factual cube" was analyzed.
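The weighted-average step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which weighting schemes were compared, so the two shown here (uniform weights and smoothed IDF weights) are assumptions, and the toy embeddings, the sample documents, and the function names are hypothetical stand-ins for a trained Word2Vec model and real news texts.

```python
import math
from collections import Counter


def weighted_average_vector(tokens, embeddings, weights=None):
    """Return the weighted mean of the vectors of all in-vocabulary tokens.

    With weights=None every token counts equally (plain average).
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok in tokens:
        if tok not in embeddings:
            continue  # skip out-of-vocabulary tokens
        w = 1.0 if weights is None else weights.get(tok, 0.0)
        for i, x in enumerate(embeddings[tok]):
            total[i] += w * x
        weight_sum += w
    if weight_sum == 0.0:
        return total  # no usable tokens: zero vector
    return [x / weight_sum for x in total]


def idf_weights(documents):
    """Smoothed inverse document frequency for each token (assumed scheme)."""
    n = len(documents)
    df = Counter(tok for doc in documents for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}


def cosine(u, v):
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


# Hypothetical 2-dimensional embeddings standing in for a trained model.
EMBEDDINGS = {
    "election": [1.0, 0.0],
    "vote": [0.8, 0.2],
    "match": [0.0, 1.0],
    "goal": [0.2, 0.8],
    "the": [0.5, 0.5],
}

# Hypothetical tokenized news items: two on one event, one on another.
DOCS = [
    ["the", "election", "vote"],
    ["the", "vote", "election"],
    ["the", "match", "goal"],
]

WEIGHTS = idf_weights(DOCS)
DOC_VECTORS = [weighted_average_vector(d, EMBEDDINGS, WEIGHTS) for d in DOCS]

if __name__ == "__main__":
    print("same event:     ", cosine(DOC_VECTORS[0], DOC_VECTORS[1]))
    print("different event:", cosine(DOC_VECTORS[0], DOC_VECTORS[2]))
```

In this sketch, documents about the same event end up with nearly identical vectors, while the IDF weights reduce the pull of words such as "the" that occur in every document. The resulting document vectors are the kind of input a clustering step would consume.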
Pages: 104-109
Page count: 6