Supervised Contrastive Learning with Term Weighting for Improving Chinese Text Classification

Cited by: 4
Authors
Guo, Jiabao [1]
Zhao, Bo [1]
Liu, Hui [1]
Liu, Yifan [1]
Zhong, Qian [1]
Affiliations
[1] Wuhan Univ, Sch Cyber Sci & Engn, Wuhan 430000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Chinese text classification; Supervised Contrastive Learning (SCL); Term Weighting (TW); Temporal Convolution Network (TCN); MODEL
DOI
10.26599/TST.2021.9010079
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
With the rapid growth of information retrieval technology, Chinese text classification, which is the basis of information content security, has become a widely discussed topic. Because Chinese differs greatly from English, representing semantic information in Chinese text tasks is more complex. Most existing Chinese text classification approaches treat feature representation and feature selection as the key points, but fail to take into account a learning strategy that adapts to the task. In addition, these approaches compress each Chinese word into a representation vector without considering how the term is distributed among the categories of interest. To improve Chinese text classification, this paper proposes a unified method called Supervised Contrastive Learning with Term Weighting (SCL-TW). Supervised contrastive learning makes full use of a large amount of unlabeled data to improve model stability. In SCL-TW, term weighting scores are calculated to optimize the data augmentation of Chinese text. The transformed features are then fed into a temporal convolution network for feature representation. Experiments are conducted on two Chinese benchmark datasets. The results demonstrate that SCL-TW outperforms other advanced Chinese text classification approaches by a clear margin.
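The abstract names supervised contrastive learning as the training strategy but does not give the loss itself. As a rough illustration only, the sketch below implements the commonly used supervised contrastive loss, in which samples sharing a label are pulled together and all other samples in the batch are pushed apart. This is not the authors' implementation: the function names, the temperature value of 0.1, and the toy embeddings are all assumptions made for this example, and the term-weighting and temporal convolution network steps are not modeled.

```python
import math


def l2_normalize(vector):
    """Scale a non-zero vector to unit length (assumes it is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss over one batch.

    For each anchor i, every other sample with the same label is a positive.
    The loss is the mean, over anchors that have at least one positive, of
    the average negative log-softmax of the positives among all other samples.
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchor_count = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        max_sim = max(sims.values())
        log_denominator = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values())
        )
        total += -sum(sims[j] - log_denominator for j in positives) / len(positives)
        anchor_count += 1
    return total / anchor_count if anchor_count else 0.0


# Hypothetical toy batch: two tight clusters in 2-D.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = supervised_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
misaligned = supervised_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
print(aligned < misaligned)
```

When the labels agree with the geometric clusters, the loss is small; when the labels cut across the clusters, it is much larger, which is the signal that drives same-class representations together during training.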
Pages: 59-68
Number of pages: 10