Multi-feature Fusion Based on Semantic Understanding Attention Neural Network for Chinese Text Categorization

Cited by: 6
Authors
Xie Jinbao [1]
Hou Yongjin [1]
Kang Shouqiang [1]
Li Baiwei [1]
Zhang Xiao [1]
Affiliations
[1] Harbin Univ Sci & Technol, Sch Elect & Elect Engn, Harbin 150080, Heilongjiang, Peoples R China
Keywords
Chinese text categorization; Multi-feature fusion; Attention algorithm; Long Short Term Memory (LSTM) network; Convolutional Neural Network (CNN)
DOI
10.11999/JEIT170815
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract
In Chinese text categorization, the important features of a text are dispersed and sparse, and different characteristics of the text contribute differently to recognizing its category. To address these problems, this paper proposes 3CLA (Three Convolutional neural network paths and a Long short term memory path fused with an Attention neural network path), a multi-feature fusion model for Chinese text categorization based on Convolutional Neural Networks (CNN), Long Short Term Memory (LSTM) networks, and a semantic understanding attention neural network. The model first preprocesses the Chinese text, performing segmentation and vectorization. Then, through the embedding layer, the input data are sent to the CNN paths, the LSTM path, and the attention path to extract text features at different levels and with different characteristics. Finally, the fusion layer fuses these features and a classifier assigns the category. Text classification experiments on a Chinese corpus show that, compared with the CNN structure model and the LSTM structure model, the proposed model improves the recognition of Chinese text categories by up to about 8%.
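The pipeline described in the abstract (embedding layer, three CNN paths, one LSTM path, one attention path, fusion layer, classifier) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the abstract gives no kernel widths, layer sizes, attention formula, or fusion method, so the kernel widths (2, 3, 4), the toy dimensions, the dot-product attention with a single query vector, fusion by concatenation, and the random untrained weights are all assumptions, as are the function names.

```python
import math
import random

random.seed(0)
EMB_DIM, HIDDEN, NUM_CLASSES, VOCAB = 4, 3, 2, 10  # toy sizes (assumed)

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def cnn_path(seq, kernel):
    """One convolution filter of width len(kernel), ReLU, max-over-time pooling."""
    width = len(kernel)
    acts = []
    for i in range(len(seq) - width + 1):
        s = sum(w * x for j in range(width) for w, x in zip(kernel[j], seq[i + j]))
        acts.append(max(0.0, s))
    return [max(acts)]

def lstm_path(seq, params):
    """Plain LSTM over the sequence; returns the final hidden state."""
    h = [0.0] * HIDDEN
    c = [0.0] * HIDDEN
    for x in seq:
        z = x + h  # list concatenation: [x_t, h_{t-1}]
        i = [sigmoid(v) for v in matvec(params["i"], z)]
        f = [sigmoid(v) for v in matvec(params["f"], z)]
        o = [sigmoid(v) for v in matvec(params["o"], z)]
        g = [math.tanh(v) for v in matvec(params["g"], z)]
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h

def attention_path(seq, query):
    """Softmax-weighted sum of token vectors, scored by dot product with a query."""
    weights = softmax([sum(q * x for q, x in zip(query, tok)) for tok in seq])
    return [sum(w * tok[d] for w, tok in zip(weights, seq)) for d in range(len(query))]

def classify(token_ids, model):
    seq = [model["embedding"][t] for t in token_ids]    # embedding layer
    fused = []
    for kernel in model["kernels"]:                      # three CNN paths
        fused += cnn_path(seq, kernel)
    fused += lstm_path(seq, model["lstm"])               # LSTM path
    fused += attention_path(seq, model["query"])         # attention path
    return softmax(matvec(model["classifier"], fused))   # fusion (concat) + classifier

model = {
    "embedding": rand_matrix(VOCAB, EMB_DIM),
    "kernels": [rand_matrix(w, EMB_DIM) for w in (2, 3, 4)],  # assumed widths
    "lstm": {gate: rand_matrix(HIDDEN, EMB_DIM + HIDDEN) for gate in "ifog"},
    "query": rand_matrix(1, EMB_DIM)[0],
    "classifier": rand_matrix(NUM_CLASSES, 3 + HIDDEN + EMB_DIM),
}

# Token ids stand in for a segmented, vectorized text (at least 4 tokens,
# so that the widest kernel fits).
probs = classify([1, 4, 2, 7, 3], model)
print(probs)
```

Each path turns the same embedded sequence into a fixed-length vector, so the fusion step here is a plain concatenation fed to a softmax classifier. With untrained weights the printed probabilities are meaningless; the sketch only shows how the paths connect.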
Pages: 1258-1265
Page count: 8