CoLAL: Co-learning Active Learning for Text Classification

Cited: 0
Authors
Le, Linh [1 ]
Zhao, Genghong [2 ]
Zhang, Xia [3 ]
Zuccon, Guido [1 ]
Demartini, Gianluca [1 ]
Affiliations
[1] Univ Queensland, St Lucia, Qld, Australia
[2] Neusoft Res Intelligent Healthcare Technol Co Ltd, Shenyang, Peoples R China
[3] Neusoft Corp, Shenyang, Peoples R China
Funding
Swiss National Science Foundation
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In machine learning, the challenge of learning effectively from limited data has become increasingly important, and Active Learning (AL) algorithms play a significant role in addressing it by improving model performance with fewer labeled samples. We introduce a novel AL algorithm, termed Co-learning (CoLAL), designed to select the most diverse and representative samples from a training dataset. The approach leverages noisy labels together with the primary model's predictions on unlabeled data. Using a probabilistic graphical model, we combine two multi-class classifiers into a single binary classifier that determines whether the main model and a peer model agree on a prediction. If they agree, the unlabeled sample is assumed to be easy to classify and therefore unlikely to improve the target model's performance. We instead prioritize data that represents the unlabeled set while lying near non-overlapping decision boundaries; the discrepancy between these boundaries can be estimated from the probability that the two models produce the same prediction. Through theoretical analysis and experimental validation, we show that integrating noisy labels into the peer model effectively exposes the target model's potential inaccuracies. We evaluated CoLAL on seven benchmark datasets: four text datasets (AGNews, DBPedia, PubMed, SST-2) against text-based state-of-the-art (SOTA) baselines, and three image datasets (CIFAR100, MNIST, OpenML-155) against computer-vision SOTA baselines. The results show that CoLAL significantly outperforms existing SOTA methods in text-based AL and is competitive with SOTA image-based AL techniques.
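The core acquisition idea in the abstract — score each unlabeled sample by the probability that the main and peer models disagree, and query the samples most likely to be disagreed on — can be sketched as below. This is a minimal illustration, not the paper's implementation: the function names, the use of raw softmax outputs, and the toy probability arrays are all assumptions for demonstration, and the full CoLAL method additionally involves the probabilistic graphical model and noisy-label training described above.

```python
import numpy as np

def disagreement_scores(probs_main, probs_peer):
    """Estimated probability that the two models disagree on each sample:
    1 minus the probability that both draw the same class, assuming the
    rows are (independent) class distributions from each model."""
    agree = np.sum(probs_main * probs_peer, axis=1)
    return 1.0 - agree

def select_batch(probs_main, probs_peer, k):
    """Return the indices of the k unlabeled samples the two models are
    most likely to disagree on (highest disagreement first)."""
    scores = disagreement_scores(probs_main, probs_peer)
    return np.argsort(scores)[::-1][:k]

# Toy example: 4 unlabeled samples, 3 classes.
p_main = np.array([[0.90, 0.05, 0.05],   # both models confident and agreed
                   [0.40, 0.40, 0.20],   # main model uncertain
                   [0.10, 0.80, 0.10],
                   [0.34, 0.33, 0.33]])  # near-uniform prediction
p_peer = np.array([[0.85, 0.10, 0.05],
                   [0.20, 0.50, 0.30],
                   [0.10, 0.75, 0.15],
                   [0.30, 0.40, 0.30]])
batch = select_batch(p_main, p_peer, k=2)
# Samples where both models are confident and agreed (sample 0) score
# lowest and are skipped; ambiguous, disputed samples are queried first.
```

A design note implied by the abstract: when both models concentrate mass on the same class, the agreement term is close to 1 and the score is near 0, matching the intuition that confidently agreed-upon samples are "easy" and add little to the target model.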
Pages: 13337-13345 (9 pages)