TLC-XML: Transformer with Label Correlation for Extreme Multi-label Text Classification

Cited by: 1
Authors
Zhao, Fei [1]
Ai, Qing [1]
Li, Xiangna [2]
Wang, Wenhui [3,4]
Gao, Qingyun [1]
Liu, Yichun [1]
Affiliations
[1] Univ Sci & Technol Liaoning, Sch Comp Sci & Software Engn, Anshan 114051, Peoples R China
[2] State Grid Corp China, State Grid Informat & Telecommun Grp Co Ltd, Beijing 100053, Peoples R China
[3] Chinese Acad Sci, Beijing Synchrotron Radiat Facil, Beijing 100049, Peoples R China
[4] Chinese Acad Sci, Chinese Spallat Neutron Source Sci Ctr, Dongguan 523808, Peoples R China
Keywords
Extreme multi-label text classification; Label correlation; Graph convolutional network; Transformer model;
DOI
10.1007/s11063-024-11460-z
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Extreme multi-label text classification (XMTC) annotates related labels for unknown text from large-scale label sets. Transformer-based methods have become the dominant approach for solving the XMTC task due to their effective text representation capabilities. However, the existing Transformer-based methods fail to effectively exploit the correlation between labels in the XMTC task. To address this shortcoming, we propose a novel model called TLC-XML, i.e., a Transformer with label correlation for extreme multi-label text classification. TLC-XML comprises three modules: Partition, Matcher and Ranker. In the Partition module, we exploit the semantic and co-occurrence information of labels to construct the label correlation graph, and further partition the strongly correlated labels into the same cluster. In the Matcher module, we propose cluster correlation learning, which uses the graph convolutional network (GCN) to extract the correlation between clusters. We then introduce these valuable correlations into the classifier to match related clusters. In the Ranker module, we propose label interaction learning, which aggregates the raw label prediction with the information of the neighboring labels. The experimental results on benchmark datasets show that TLC-XML significantly outperforms state-of-the-art XMTC methods.
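The abstract mentions building a label correlation graph from co-occurrence information and applying a graph convolutional network (GCN) to it. The following is a minimal sketch of that general idea only, not the authors' implementation: it assumes a binary document-label matrix, uses raw co-occurrence counts as edge weights, and applies the standard symmetric GCN normalization. The function names, the toy data, and the random feature and weight matrices are all hypothetical, and the semantic part of the graph and the clustering step are omitted.

```python
import numpy as np

def cooccurrence_adjacency(Y):
    """Count label co-occurrences in a binary document-label matrix Y (docs x labels)."""
    A = (Y.T @ Y).astype(float)
    np.fill_diagonal(A, 0.0)  # drop each label's co-occurrence with itself
    return A

def normalize_adjacency(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} used by a standard GCN layer."""
    A_hat = A + np.eye(A.shape[0])  # add self-loops so every node has degree >= 1
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_norm, H, W):
    """One graph-convolution step: ReLU(A_norm @ H @ W)."""
    return np.maximum(A_norm @ H @ W, 0.0)

# Toy data (assumption): 3 documents, 3 labels; labels 0 and 1 always co-occur.
Y = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1]])
A = cooccurrence_adjacency(Y)
A_norm = normalize_adjacency(A)

rng = np.random.default_rng(0)
H = rng.normal(size=(3, 4))  # hypothetical initial label features
W = rng.normal(size=(4, 2))  # hypothetical layer weights (learned in practice)
H_out = gcn_layer(A_norm, H, W)
```

In this toy graph, labels 0 and 1 are connected and label 2 is isolated, so after the layer the features of labels 0 and 1 are mixed with each other while label 2 only sees its own features through the self-loop.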
Pages: 21
Related papers
50 in total
  • [21] LABEL CORRELATION MIXTURE MODEL FOR MULTI-LABEL TEXT CATEGORIZATION
    He, Zhiyang
    Wu, Ji
    Lv, Ping
    [J]. 2014 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY SLT 2014, 2014, : 83 - 88
  • [22] MULTI-LABEL TEXT CLASSIFICATION WITH A ROBUST LABEL DEPENDENT REPRESENTATION
    Alfaro, Rodrigo
    Allende, Hector
    [J]. 2011 INTERNATIONAL CONFERENCE ON INSTRUMENTATION, MEASUREMENT, CIRCUITS AND SYSTEMS (ICIMCS 2011), VOL 3: COMPUTER-AIDED DESIGN, MANUFACTURING AND MANAGEMENT, 2011, : 211 - 214
  • [23] Enhancing Label Correlation Feedback in Multi-Label Text Classification via Multi-Task Learning
    Zhang, Ximing
    Zhang, Qian-Wen
    Yan, Zhao
    Liu, Ruifang
    Cao, Yunbo
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 1190 - 1200
  • [24] Long-tailed Extreme Multi-label Text Classification by the Retrieval of Generated Pseudo Label Descriptions
    Zhang, Ruohong
    Wang, Yau-Shian
    Yang, Yiming
    Yu, Donghan
    Vu, Tom
    Lei, Likun
    [J]. 17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 1092 - 1106
  • [25] Label-Aware Document Representation via Hybrid Attention for Extreme Multi-Label Text Classification
    Huang, Xin
    Chen, Boli
    Xiao, Lin
    Yu, Jian
    Jing, Liping
    [J]. NEURAL PROCESSING LETTERS, 2022, 54 (05) : 3601 - 3617
  • [26] Label-Aware Document Representation via Hybrid Attention for Extreme Multi-Label Text Classification
    Xin Huang
    Boli Chen
    Lin Xiao
    Jian Yu
    Liping Jing
    [J]. Neural Processing Letters, 2022, 54 : 3601 - 3617
  • [27] Extreme Learning Machine for Multi-Label Classification
    Sun, Xia
    Xu, Jingting
    Jiang, Changmeng
    Feng, Jun
    Chen, Su-Shing
    He, Feijuan
    [J]. ENTROPY, 2016, 18 (06)
  • [28] Reweighting Forest for Extreme Multi-label Classification
    Lin, Zhun-Zheng
    Dai, Bi-Ru
    [J]. BIG DATA ANALYTICS AND KNOWLEDGE DISCOVERY, DAWAK 2017, 2017, 10440 : 286 - 299
  • [29] Extreme Multi-label Classification for Information Retrieval
    Dembczynski, Krzysztof
    Babbar, Rohit
    [J]. ADVANCES IN INFORMATION RETRIEVAL (ECIR 2018), 2018, 10772 : 839 - 840
  • [30] Multi-Label Classification with Extreme Learning Machine
    Kongsorot, Yanika
    Horata, Punyaphol
    [J]. 2014 6TH INTERNATIONAL CONFERENCE ON KNOWLEDGE AND SMART TECHNOLOGY (KST), 2014, : 81 - 86