Fast Multi-Resolution Transformer Fine-tuning for Extreme Multi-label Text Classification

Cited by: 0
Authors
Zhang, Jiong [1 ]
Chang, Wei-cheng [1 ]
Yu, Hsiang-fu [1 ]
Dhillon, Inderjit S. [1 ,2 ]
Affiliations
[1] Amazon, Palo Alto, CA 94303 USA
[2] UT Austin, Austin, TX USA
Keywords
DOI
Not available
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Extreme multi-label text classification (XMC) seeks to find relevant labels from an extremely large label collection for a given text input. Many real-world applications can be formulated as XMC problems, such as recommendation systems, document tagging, and semantic search. Recently, transformer-based XMC methods, such as X-Transformer and LightXML, have shown significant improvement over other XMC methods. Despite leveraging pre-trained transformer models for text representation, fine-tuning transformer models on a large label space still requires lengthy computation even with powerful GPUs. In this paper, we propose a novel recursive approach, XR-Transformer, which accelerates the procedure by recursively fine-tuning transformer models on a series of multi-resolution objectives related to the original XMC objective function. Empirical results show that XR-Transformer takes significantly less training time than other transformer-based XMC models while yielding new state-of-the-art results. In particular, on the public Amazon-3M dataset with 3 million labels, XR-Transformer is not only 20x faster than X-Transformer but also improves the Precision@1 from 51% to 54%. Our code is publicly available at https://github.com/amzn/pecos.
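The multi-resolution idea described in the abstract can be sketched as follows. This is a hedged illustration, not the authors' PECOS implementation: it builds a label tree top-down with plain k-means over hypothetical label feature vectors and derives coarser training targets by OR-ing together each cluster's labels, so a model can first be fine-tuned on a few coarse pseudo-labels before tackling the full label space. The function names, the `depth`/`branch` parameters, and the clustering choice are all illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_label_tree(label_feats, depth=3, branch=2, seed=0):
    """Recursively cluster labels into a tree (illustrative sketch).

    Returns one array per level, mapping each label to its cluster id
    at that resolution; later levels are finer (more clusters).
    """
    n_labels = label_feats.shape[0]
    assignments = []
    parent = np.zeros(n_labels, dtype=int)  # all labels start in one root cluster
    for _ in range(depth):
        new_assign = np.zeros(n_labels, dtype=int)
        next_id = 0
        for c in np.unique(parent):
            idx = np.where(parent == c)[0]
            # split each cluster into at most `branch` children
            k = min(branch, len(idx))
            km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(label_feats[idx])
            new_assign[idx] = km.labels_ + next_id
            next_id += k
        assignments.append(new_assign)
        parent = new_assign
    return assignments

def coarsen_targets(y, assign):
    """Project a multi-label indicator vector onto one tree level:
    a cluster is positive iff any of its member labels is positive."""
    coarse = np.zeros(assign.max() + 1, dtype=int)
    coarse[assign[y.astype(bool)]] = 1
    return coarse
```

A coarse-to-fine schedule would then fine-tune on `coarsen_targets(y, assignments[0])` first, reusing the learned encoder at each successively finer level, which is the rough shape of the recursive procedure the abstract refers to.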
Pages: 14
Related Papers
50 records in total
  • [1] Multi-resolution Fine-Tuning of Vision Transformers
    Fitzgerald, Kerr
    Law, Meng
    Seah, Jarrel
    Tang, Jennifer
    Matuszewski, Bogdan
    [J]. MEDICAL IMAGE UNDERSTANDING AND ANALYSIS, MIUA 2022, 2022, 13413 : 535 - 546
  • [2] TLC-XML: Transformer with Label Correlation for Extreme Multi-label Text Classification
    Zhao, Fei
    Ai, Qing
    Li, Xiangna
    Wang, Wenhui
    Gao, Qingyun
    Liu, Yichun
    [J]. NEURAL PROCESSING LETTERS, 2024, 56 (01)
  • [4] Deep Learning for Extreme Multi-label Text Classification
    Liu, Jingzhou
    Chang, Wei-Cheng
    Wu, Yuexin
    Yang, Yiming
    [J]. SIGIR'17: PROCEEDINGS OF THE 40TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2017, : 115 - 124
  • [5] Correlation Networks for Extreme Multi-label Text Classification
    Xun, Guangxu
    Jha, Kishlay
    Sun, Jianhui
    Zhang, Aidong
    [J]. KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020, : 1074 - 1082
  • [6] CascadeXML: Rethinking Transformers for End-to-end Multi-resolution Training in Extreme Multi-label Classification
    Kharbanda, Siddhant
    Banerjee, Atmadeep
    Schultheis, Erik
    Babbar, Rohit
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [7] A Hierarchical Fine-Tuning Approach Based on Joint Embedding of Words and Parent Categories for Hierarchical Multi-label Text Classification
    Ma, Yinglong
    Zhao, Jingpeng
    Jin, Beihong
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2020, PT II, 2020, 12397 : 746 - 757
  • [8] Fine-Tuning BERT for Multi-Label Sentiment Analysis in Unbalanced Code-Switching Text
    Tang, Tiancheng
    Tang, Xinhuai
    Yuan, Tianyi
    [J]. IEEE ACCESS, 2020, 8 (08): 193248 - 193256
  • [9] Extreme Multi-Label Text Classification Based on Balance Function
    Chen, Zhaohong
    Hong, Zhiyong
    Yu, Wenhua
    Zhang, Xin
    [J]. Computer Engineering and Applications, 2024, 60 (04) : 163 - 172
  • [10] Taming Pretrained Transformers for Extreme Multi-label Text Classification
    Chang, Wei-Cheng
    Yu, Hsiang-Fu
    Zhong, Kai
    Yang, Yiming
    Dhillon, Inderjit S.
    [J]. KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020, : 3163 - 3171