InfoCL: Alleviating Catastrophic Forgetting in Continual Text Classification from An Information Theoretic Perspective

Cited by: 0
Authors
Song, Yifan [1]
Wang, Peiyi [1]
Xiong, Weimin [1]
Zhu, Dawei [1]
Liu, Tianyu [2]
Sui, Zhifang [1]
Li, Sujian [1]
Affiliations
[1] Peking Univ, Sch Comp Sci, Natl Key Lab Multimedia Informat Proc, Beijing, Peoples R China
[2] Alibaba Grp, Hangzhou, Peoples R China
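Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023) | 2023
Funding
National Key Research and Development Program of China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract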
Continual learning (CL) aims to constantly learn new knowledge over time while avoiding catastrophic forgetting on old tasks. We focus on continual text classification under the class-incremental setting. Recent CL studies have identified the severe performance decrease on analogous classes as a key factor for catastrophic forgetting. In this paper, through an in-depth exploration of the representation learning process in CL, we discover that the compression effect of the information bottleneck leads to confusion on analogous classes. To enable the model to learn more sufficient representations, we propose a novel replay-based continual text classification method, InfoCL. Our approach utilizes fast-slow and current-past contrastive learning to perform mutual information maximization and better recover the previously learned representations. In addition, InfoCL incorporates an adversarial memory augmentation strategy to alleviate the overfitting problem of replay. Experimental results demonstrate that InfoCL effectively mitigates forgetting and achieves state-of-the-art performance on three text classification tasks. The code is publicly available at https://github.com/Yifan-Song793/InfoCL.
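The abstract names contrastive learning as the means of mutual information maximization. As a rough illustration only, the sketch below computes a generic InfoNCE loss, a standard contrastive objective: with N candidates per anchor, log N minus the loss is a lower bound on the mutual information between the two views, so lowering the loss raises the bound. This is not InfoCL's implementation. The function names, the temperature value, and the toy vectors are assumptions made for the example, and the record does not say which contrastive loss the paper actually uses.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    positives[i] is the positive view for anchors[i]; every other
    positives[j] acts as a negative for anchors[i].
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = [cosine(anchors[i], positives[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the true positive.
        total += log_denominator - logits[i]
    return total / n


# Toy representations (hypothetical): two views that agree vs. two that are swapped.
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b_aligned = [[1.0, 0.0], [0.0, 1.0]]
view_b_swapped = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = info_nce(view_a, view_b_aligned)
loss_swapped = info_nce(view_a, view_b_swapped)
mi_lower_bound = math.log(len(view_a)) - loss_aligned
```

In this toy setup the aligned views yield a much smaller loss than the swapped ones, and the resulting bound can never exceed log N, which is one reason contrastive methods tend to use many negatives.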
Pages: 14557-14570
Number of pages: 14