Supervised Contrast Learning Text Classification Model Based on Data Quality Augmentation

Cited by: 0
Authors
Wu, Liang [1]
Zhang, Fangfang [1]
Cheng, Chao [1]
Song, Shinan [1]
Affiliations
[1] Changchun Univ Technol, Sch Comp Sci & Engn, Changchun 130012, Peoples R China
Keywords
Text augmentation; data quality; text classification; contrast learning
DOI
10.1145/3653300
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Token-level data augmentation generates text samples by modifying the words of the sentences. However, data that are not easily classified can negatively affect the model. In particular, not considering the role of keywords when performing random augmentation operations on samples may lead to the generation of low-quality supplementary samples. Therefore, we propose a supervised contrast learning text classification model based on data quality augmentation. First, dynamic training is used to screen high-quality datasets containing beneficial information for model training. The selected data is then augmented with data based on important words with tag information. To obtain a better text representation to serve the downstream classification task, we employ a standard supervised contrast loss to train the model. Finally, we conduct experiments on five text classification datasets to validate the effectiveness of our model. In addition, ablation experiments are conducted to verify the impact of each module on classification.
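The abstract names "a standard supervised contrast loss" without giving its form. The sketch below assumes the commonly used supervised contrastive (SupCon) formulation, in which each sample is pulled toward the other samples in the batch that share its label and pushed away from the rest. The function name, the NumPy implementation, and the temperature value of 0.1 are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Batch supervised contrastive loss (assumed SupCon form, not the paper's code).

    embeddings: (n, d) array of text representations.
    labels:     (n,) array of class labels.
    """
    z = np.asarray(embeddings, dtype=np.float64)
    y = np.asarray(labels)
    n = z.shape[0]

    # L2-normalise so that dot products are cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    logits = z @ z.T / temperature

    not_self = ~np.eye(n, dtype=bool)
    # Positives: other samples in the batch that carry the same label.
    positives = (y[:, None] == y[None, :]) & not_self

    # Log-softmax over all other samples; the anchor itself is excluded
    # from the denominator. Subtracting the row maximum keeps exp() stable.
    logits = logits - logits.max(axis=1, keepdims=True)
    exp_logits = np.exp(logits) * not_self
    log_prob = logits - np.log(exp_logits.sum(axis=1, keepdims=True))

    pos_count = positives.sum(axis=1)
    has_pos = pos_count > 0
    if not has_pos.any():
        # No anchor has a same-label partner in this batch.
        return 0.0

    # Average log-probability of the positives for each anchor that has any.
    pos_log_prob = (log_prob * positives).sum(axis=1)[has_pos]
    mean_log_prob_pos = pos_log_prob / pos_count[has_pos]
    return float(-mean_log_prob_pos.mean())
```

With embeddings that cluster by class, the loss is close to zero when the labels match the clusters and grows when they do not, which is the behaviour a representation-learning objective of this kind is meant to reward.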
Pages: 12
Related papers
50 in total
  • [1] VDCL : A supervised text classification method based on virtual adversarial and contrast learning
    Dou, Ximeng
    Zhao, Jing
    Li, Ming
    [J]. 2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023
  • [2] TEXT CLASSIFICATION BASED ON SEMI-SUPERVISED LEARNING
    Vo Duy Thanh
    Vo Trung Hung
    Pham Minh Tuan
    Doan Van Ban
    [J]. 2013 INTERNATIONAL CONFERENCE OF SOFT COMPUTING AND PATTERN RECOGNITION (SOCPAR), 2013, : 232 - 236
  • [3] Contrastive learning with text augmentation for text classification
    Jia, Ouyang
    Huang, Huimin
    Ren, Jiaxin
    Xie, Luodi
    Xiao, Yinyin
    [J]. APPLIED INTELLIGENCE, 2023, 53 (16) : 19522 - 19531
  • [4] Contrastive learning with text augmentation for text classification
    Ouyang Jia
    Huimin Huang
    Jiaxin Ren
    Luodi Xie
    Yinyin Xiao
    [J]. Applied Intelligence, 2023, 53 : 19522 - 19531
  • [5] TABAS: Text augmentation based on attention score for text classification model
    Yu, Yeong Jae
    Yoon, Seung Joo
    Jun, So Young
    Kim, Jong Woo
    [J]. ICT EXPRESS, 2022, 8 (04) : 549 - 554
  • [6] Augmentation Learning for Semi-Supervised Classification
    Frommknecht, Tim
    Zipf, Pedro Alves
    Fan, Quanfu
    Shvetsova, Nina
    Kuehne, Hilde
    [J]. PATTERN RECOGNITION, DAGM GCPR 2022, 2022, 13485 : 85 - 98
  • [7] An Optimal Weighting Method in Supervised Learning of Linguistic Model for Text Classification
    Mikawa, Kenta
    Ishida, Takashi
    Goto, Masayuki
    [J]. INDUSTRIAL ENGINEERING AND MANAGEMENT SYSTEMS, 2012, 11 (01) : 87 - 93
  • [8] TextANN: An Improved Text Classification Model Based on Data Augmentation
    Li, Hong
    Yang, Xiaosheng
    Yang, Guoqing
    Ouyang, Xiaogang
    Chen, Yu
    Wang, Xueqing
    [J]. 2018 INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, BIG DATA AND BLOCKCHAIN (ICCBB 2018), 2018, : 160 - 163
  • [9] Text AutoAugment: Learning Compositional Augmentation Policy for Text Classification
    Ren, Shuhuai
    Zhang, Jinchao
    Li, Lei
    Sun, Xu
    Zhou, Jie
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 9029 - 9043
  • [10] Contrastive learning based on linguistic knowledge and adaptive augmentation for text classification
    Zhang, Shaokang
    Ran, Ning
    [J]. KNOWLEDGE-BASED SYSTEMS, 2024, 300