WeStcoin: Weakly-Supervised Contextualized Text Classification with Imbalance and Noisy Labels

Cited by: 1
Authors
Zhang, Yupei [1, 2]
Zhou, Yaya [1, 2]
Liu, Shuhui [1, 2]
Zhang, Wenxin [1, 2]
Xiao, Min [1, 2]
Shang, Xuequn [1, 2]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Xian 710129, Shaanxi, Peoples R China
[2] Minist Ind & Informat Technol, Big Data Storage & Management Lab, Xian 710129, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
SMOTE;
DOI
10.1109/ICPR56361.2022.9956110
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The joint problem of imbalanced samples and noisy labels challenges current text classifiers in real-world applications. Existing approaches mostly address either the former or the latter but fail to handle the two together. This paper introduces a novel weakly-supervised framework, dubbed WeStcoin, that accounts for the cost sensitivity of misclassifications between classes and seeks seed words for noisy-label correction. After BERT creates a contextualized corpus, WeStcoin learns a predicted label vector from the contextualized samples and, in parallel, calculates a pseudo probability vector from seed words. It then projects the concatenated representation into an output space and multiplies the result by a cost-sensitive matrix. WeStcoin is ultimately trained to decrease the residual between the model outputs and the noisy labels, with the seed words updated iteratively. Extensive experiments and ablation studies on two public text datasets demonstrate that the proposed model outperforms the state-of-the-art model in text classification with imbalanced samples and noisy labels. Code is available at https://github.com/ypzhaang.
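
The forward pass described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every function name, the seed-word counting rule, the softmax after the projection, the side on which the cost-sensitive matrix is applied, and the squared-error residual are assumptions made for illustration. The predicted label vector is taken as given, standing in for whatever the BERT-based classifier would produce.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def seed_word_probs(tokens, seed_words):
    """Pseudo probability vector: share of seed-word hits per class.

    Assumption: a simple hit count; the paper may weight seed words differently.
    """
    counts = [sum(tok in seeds for tok in tokens) for seeds in seed_words]
    total = sum(counts)
    if total == 0:
        # No seed word matched: fall back to a uniform vector.
        return [1.0 / len(seed_words)] * len(seed_words)
    return [c / total for c in counts]


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def forward(pred_probs, pseudo_probs, projection, cost_matrix):
    """Concatenate both vectors, project to K outputs, apply the cost matrix."""
    joint = pred_probs + pseudo_probs             # concatenation, length 2K
    output = softmax(matvec(projection, joint))   # projection is K x 2K (assumed)
    return matvec(cost_matrix, output)            # cost matrix is K x K (assumed)


def residual(output, noisy_label):
    """Squared residual between the output and a one-hot noisy label (assumed loss)."""
    target = [1.0 if k == noisy_label else 0.0 for k in range(len(output))]
    return sum((o - t) ** 2 for o, t in zip(output, target))


# Hypothetical two-class example: class 0 = sports, class 1 = finance.
seed_words = [{"goal", "match"}, {"stock", "market"}]
tokens = "the stock market fell after the match".split()
pseudo = seed_word_probs(tokens, seed_words)
pred = [0.6, 0.4]                                 # stand-in for the classifier output
projection = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
cost_matrix = [[1.0, 0.0], [0.0, 2.0]]            # up-weights the (assumed) minority class
loss = residual(forward(pred, pseudo, projection, cost_matrix), noisy_label=1)
```

In a trainable version, the projection and the classifier would be updated to reduce this residual, and the seed-word sets would be refreshed between iterations, as the abstract states; neither step is shown here.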
Pages: 2451-2457
Number of pages: 7