Taming Pretrained Transformers for Extreme Multi-label Text Classification

Cited by: 116
Authors
Chang, Wei-Cheng [1 ]
Yu, Hsiang-Fu [2 ]
Zhong, Kai [2 ]
Yang, Yiming [1 ]
Dhillon, Inderjit S. [2 ,3 ]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Amazon, Bellevue, WA USA
[3] UT Austin, Austin, TX USA
Keywords
Transformer models; eXtreme Multi-label text classification;
DOI
10.1145/3394486.3403368
CLC Number
TP18 [Artificial intelligence theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
We consider the extreme multi-label text classification (XMC) problem: given an input text, return the most relevant labels from a large label collection. For example, the input text could be a product description on Amazon.com and the labels could be product categories. XMC is an important yet challenging problem in the NLP community. Recently, deep pretrained transformer models have achieved state-of-the-art performance on many NLP tasks including sentence classification, albeit with small label sets. However, naively applying deep transformer models to the XMC problem leads to sub-optimal performance due to the large output space and the label sparsity issue. In this paper, we propose X-Transformer, the first scalable approach to fine-tuning deep transformer models for the XMC problem. The proposed method achieves new state-of-the-art results on four XMC benchmark datasets. In particular, on a Wiki dataset with around 0.5 million labels, the prec@1 of X-Transformer is 77.28%, a substantial improvement over state-of-the-art XMC approaches Parabel (linear) and AttentionXML (neural), which achieve 68.70% and 76.95% prec@1, respectively. We further apply X-Transformer to a product2query dataset from Amazon and gain a 10.7% relative improvement in prec@1 over Parabel.
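The abstract reports results in prec@1 (precision at the top-ranked label). As an illustrative sketch only (not code from the paper, and using hypothetical label names), prec@k can be computed like this: for each sample, take the k highest-scored predicted labels and measure what fraction of them are truly relevant.

```python
def precision_at_k(predicted, relevant, k=1):
    """prec@k for one sample.

    predicted: list of labels ranked by model score (best first)
    relevant:  set of ground-truth labels for this sample
    """
    top_k = predicted[:k]
    # Fraction of the top-k predictions that are actually relevant.
    return sum(1 for label in top_k if label in relevant) / k

# Toy example with made-up product-category labels:
ranked = ["Electronics", "Cables", "Audio"]
true_labels = {"Electronics", "Audio"}
p1 = precision_at_k(ranked, true_labels, k=1)  # top-1 hit -> 1.0
p3 = precision_at_k(ranked, true_labels, k=3)  # 2 of top-3 relevant -> 2/3
```

In XMC settings with hundreds of thousands of labels, this per-sample value is averaged over the test set, which is how figures like the 77.28% prec@1 quoted above are obtained.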
Pages: 3163-3171 (9 pages)