A Combined-Convolutional Neural Network for Chinese News Text Classification

Authors
Zhang Y. [1, 2]
Liu K.-F. [1]
Zhang Q.-X. [3]
Wang Y.-G. [1]
Gao K.-L. [1]
Affiliations
[1] School of Electrical and Information Engineering, Beijing Key Laboratory of Intelligent Processing for Building Big Data, Beijing University of Civil Engineering and Architecture, Beijing
[2] State Key Laboratory in China for Geo Mechanics and Deep Underground Engineering, China University of Mining & Technology, Beijing
[3] School of Computer Science and Technology, Beijing Institute of Technology, Beijing
Keywords
Chinese news; Combined-convolutional neural network; Natural language processing; Text classification; Word vector
DOI
10.12263/DZXB.20200134
Abstract
Most existing research on news classification targets English text, and traditional machine learning methods extract local text-block features incompletely when processing long texts. To address the lack of a dedicated term set for Chinese news classification, a vocabulary suited to Chinese text classification is built with a data-index method, and text features are constructed by combining it with word2vec pre-trained word vectors. To address incomplete feature extraction, the structure of the classical convolutional neural network model is modified, and the effects of different convolution and pooling operations on the classification results are studied. To improve the precision of Chinese news text classification, this paper proposes and implements a combined-convolutional neural network model and designs an effective method for regularizing and optimizing it. The experimental results show that the combined-convolutional neural network model reaches a precision of 93.69% on Chinese news text classification, which is 6.34% and 1.19% higher than the best traditional machine learning method and the classic convolutional neural network model, respectively, and that it also outperforms the comparison models in recall and F-measure. © 2021, Chinese Institute of Electronics. All rights reserved.
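The abstract does not spell out the architecture, so the following is only a minimal sketch of the general idea it names: index a vocabulary, look up word vectors, run convolution branches of different widths over the token sequence, max-pool each branch, and concatenate the results. Everything here is an assumption made for illustration: the function names (build_index, conv1d_max, combined_features), the toy 2-dimensional embedding table (standing in for word2vec vectors), the hand-written kernels, and the already-segmented example sentence. It is not the authors' model and leaves out training, regularization, and the classifier layer.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def build_index(tokenized_docs: Sequence[Sequence[str]]) -> Dict[str, int]:
    """Map each distinct token to an integer id (hypothetical data index).

    Id 0 is reserved for padding / unknown tokens.
    """
    index: Dict[str, int] = {}
    for doc in tokenized_docs:
        for token in doc:
            if token not in index:
                index[token] = len(index) + 1
    return index


def conv1d_max(embedded: Sequence[Vector], kernel: Sequence[Vector]) -> float:
    """Slide one kernel along the token axis, apply ReLU, then max-pool over time."""
    width = len(kernel)
    best = 0.0  # ReLU floor: negative activations are clipped to zero
    for start in range(len(embedded) - width + 1):
        total = 0.0
        for offset in range(width):
            total += sum(
                e * k for e, k in zip(embedded[start + offset], kernel[offset])
            )
        best = max(best, total)
    return best


def combined_features(
    embedded: Sequence[Vector],
    kernels_by_width: Dict[int, List[List[Vector]]],
) -> List[float]:
    """Concatenate the pooled output of every kernel in every width branch."""
    features: List[float] = []
    for width in sorted(kernels_by_width):
        for kernel in kernels_by_width[width]:
            features.append(conv1d_max(embedded, kernel))
    return features


# Toy example: an already-segmented sentence and made-up 2-d vectors.
tokens = ["北京", "发布", "新", "政策"]
index = build_index([tokens])
embedding = {
    0: [0.0, 0.0],   # padding / unknown
    1: [1.0, 0.0],
    2: [0.0, 1.0],
    3: [1.0, 1.0],
    4: [0.5, -0.5],
}
embedded = [embedding[index.get(t, 0)] for t in tokens]

kernels = {
    2: [[[1.0, 0.0], [0.0, 1.0]]],              # one kernel spanning 2 tokens
    3: [[[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]],  # one kernel spanning 3 tokens
}
features = combined_features(embedded, kernels)
print(features)  # [2.0, 4.0]
```

In a real pipeline the concatenated feature vector would feed a dense softmax layer, and the kernels and embeddings would be learned or loaded from a pre-trained word2vec model rather than written by hand.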
Pages: 1059-1067
Page count: 8