Text classification model based on multi-head attention capsule networks

Citations: 0
Authors
Jia X. [1 ]
Wang L. [1 ]
Affiliations
[1] College of Data Science, Taiyuan University of Technology, Taiyuan
Keywords
Capsule networks; Multi-head attention; Natural language processing; Text classification;
DOI
10.16511/j.cnki.qhdxxb.2020.26.006
Abstract
The importance of each word in a text sequence, and the dependencies between words, strongly influence how a text's category is identified. Capsule networks cannot selectively focus on the important words in a text, nor can they encode long-distance dependencies, so they have significant limitations when classifying texts with semantic transitions. To address these problems, this paper proposes a capsule network based on multi-head attention, which encodes the dependencies between words, captures the important words in a text, and encodes the text's semantics, thereby effectively improving text classification performance. Experimental results show that the proposed model outperforms both convolutional neural networks and capsule networks on text classification tasks, and is especially effective on multi-label text classification. The results further show that the model benefits from the attention mechanism. © 2020, Tsinghua University Press. All rights reserved.
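The multi-head attention mechanism the abstract refers to can be illustrated with a minimal sketch. This is a generic scaled dot-product self-attention over word embeddings, not the authors' implementation; all names, shapes, and the use of random weights in place of learned parameters are illustrative assumptions. The attention weight matrix is what lets every word attend to every other word, encoding the long-distance dependencies that plain capsule networks miss.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, num_heads, rng):
    """x: (seq_len, d_model) word embeddings for one text (hypothetical shapes)."""
    seq_len, d_model = x.shape
    assert d_model % num_heads == 0
    d_head = d_model // num_heads
    # Random projections stand in for learned parameters.
    Wq, Wk, Wv, Wo = (rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
                      for _ in range(4))

    def split(t):  # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ Wq), split(x @ Wk), split(x @ Wv)
    # Scaled dot-product attention: every word attends to every other word.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores)                    # (num_heads, seq_len, seq_len)
    heads = weights @ v                          # (num_heads, seq_len, d_head)
    # Concatenate heads and project back to d_model.
    out = heads.transpose(1, 0, 2).reshape(seq_len, d_model) @ Wo
    return out, weights

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 8))                  # 5 words, 8-dim embeddings
out, w = multi_head_attention(x, num_heads=2, rng=rng)
print(out.shape, w.shape)                        # (5, 8) (2, 5, 5)
```

In the paper's architecture the per-word outputs of such a layer would then feed into the capsule layers; each head's weight matrix `w[h]` shows which words that head deems important for each position.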
Pages: 415-421
Page count: 6
Related papers
20 references in total
  • [1] Joachims T., Text categorization with support vector machines: Learning with many relevant features, Proceedings of the 10th European Conference on Machine Learning, pp. 137-142, (1998)
  • [2] McCallum A., Nigam K., A comparison of event models for naive Bayes text classification, AAAI-98 Workshop on Learning for Text Categorization, pp. 41-48, (1998)
  • [3] Zhang W., Yoshida T., Tang X.J., TFIDF, LSI and multi-word in information retrieval and text categorization, Proceedings of 2008 IEEE International Conference on Systems, Man and Cybernetics, pp. 108-113, (2008)
  • [4] Lin C.Y., Hovy E., Automatic evaluation of summaries using N-gram co-occurrence statistics, Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, (2003)
  • [5] Genkin A., Lewis D.D., Madigan D., Large-scale Bayesian logistic regression for text categorization, Technometrics, 49, 3, pp. 291-304, (2007)
  • [6] Pang B., Lee L., Vaithyanathan S., Thumbs up? Sentiment classification using machine learning techniques, Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing, pp. 79-86, (2002)
  • [7] Mikolov T., Sutskever I., Chen K., Et al., Distributed representations of words and phrases and their compositionality, Proceedings of the 26th International Conference on Neural Information Processing Systems, pp. 3111-3119, (2013)
  • [8] Pennington J., Socher R., Manning C.D., GloVe: Global vectors for word representation, Proceedings of 2014 Conference on Empirical Methods in Natural Language Processing, pp. 1532-1543, (2014)
  • [9] Kim Y., Convolutional neural networks for sentence classification, (2014)
  • [10] Conneau A., Schwenk H., Cun Y.L., Et al., Very deep convolutional networks for text classification, Proceedings of the 15th European Chapter of the Association for Computational Linguistics, pp. 1107-1116, (2017)