An Efficient Character-Level and Word-Level Feature Fusion Method for Chinese Text Classification

Cited by: 8
Authors
Jin Wenzhen [1]
Zhu Hong [1]
Yang Guocai [1]
Affiliations
[1] Southwest Univ, Coll Comp & Informat Sci, Chongqing, Peoples R China
DOI
10.1088/1742-6596/1229/1/012057
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
To extract semantic feature information from text more efficiently and to reduce the effect of text representation on classification results, we propose C_BiGRU_ATT, a feature fusion model based on deep learning. The core task of the model is to extract the contextual information and the local information of the text at both the character level and the word level, using a Convolutional Neural Network (CNN) and an attention-based Bidirectional Gated Recurrent Unit (BiGRU). Our experimental results show that C_BiGRU_ATT reaches classification accuracies of 95.55% and 95.60% on two Chinese datasets, THUCNews and WangYi, respectively. Compared with single CNN models at the character level and the word level, the classification accuracy of C_BiGRU_ATT is higher by 1.6% and 2.7% on THUCNews, and by 0.6% and 5.2% on WangYi. These results show that the proposed model extracts text features more effectively.
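The fusion idea in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not say how the branches are pooled or combined, so dot-product attention pooling, max-over-time pooling, and fusion by concatenation are all assumptions here. The function names and the toy numbers, which stand in for BiGRU hidden states and CNN feature maps, are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(states, query):
    """Weight each time step by softmax(dot(state, query)) and sum.

    Assumption: simple dot-product attention; the paper may use another form.
    """
    scores = [sum(h * q for h, q in zip(state, query)) for state in states]
    weights = softmax(scores)
    return [sum(w * state[i] for w, state in zip(weights, states))
            for i in range(len(states[0]))]

def max_over_time(states):
    """Keep the largest activation per feature across all positions."""
    return [max(column) for column in zip(*states)]

def fuse(context_vec, local_vec):
    """Assumption: fuse the two feature vectors by concatenation."""
    return list(context_vec) + list(local_vec)

# Toy stand-ins for encoder outputs (hypothetical values).
recurrent_states = [[0.1, 0.4], [0.3, 0.2], [0.5, 0.0]]  # e.g. BiGRU states
conv_states = [[0.2, 0.9, 0.1], [0.7, 0.3, 0.4]]         # e.g. CNN feature maps
query = [0.0, 0.0]  # a zero query gives uniform attention weights

context_vec = attention_pool(recurrent_states, query)  # contextual features
local_vec = max_over_time(conv_states)                 # local features
fused = fuse(context_vec, local_vec)                   # input to a classifier
```

In a real model the query vector and both encoders would be learned, and the fused vector would feed a softmax classification layer; the sketch only shows how features from two branches can end up in a single vector.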
Pages: 6