CLIPMulti: Explore the performance of multimodal enhanced CLIP for zero-shot text classification

被引:0
|
作者
Wang, Peng [1 ,2 ]
Li, Dagang [1 ,2 ]
Hu, Xuesi [1 ,2 ,3 ]
Wang, Yongmei [1 ,2 ,4 ]
Zhang, Youhua [1 ,2 ]
机构
[1] Anhui Agr Univ, Sch Informat & Artificial Intelligence, Hefei, Anhui, Peoples R China
[2] Macau Univ Sci & Technol, Fac Innovat Engn, Sch Comp Sci & Engn, Ave Wai Long, Taipa 999078, Macao, Peoples R China
[3] Anhui Agr Univ, Sch Econ & Management, Hefei, Anhui, Peoples R China
[4] Anhui Prov Engn Lab Beidou Precis Agr Informat, Hefei, Anhui, Peoples R China
来源
关键词
Zero-shot text classification; CLIP; Multimodality;
D O I
10.1016/j.csl.2024.101748
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Zero-shot text classification does not require large amounts of labeled data and is designed to handle text classification tasks that lack annotated training data. Existing zero-shot text classification uses either a text-text matching paradigm or a text-image matching paradigm, which shows good performance on different benchmark datasets. However, the existing classification paradigms only consider a single modality for text matching, and little attention is paid to the help of multimodality for text classification. In order to incorporate multimodality into zero-shot text classification, we propose a multimodal enhanced CLIP framework (CLIPMulti), which employs a text-image&text matching paradigm to enhance the effectiveness of zero-shot text classification. Three different image and text combinations are tested for their effects on zero-shot text classification, and a matching method (Match-CLIPMulti) is further proposed to find the corresponding text based on the classified images automatically. We conducted experiments on seven publicly available zero-shot text classification datasets and achieved competitive performance. In addition, we analyzed the effect of different parameters on the Match-CLIPMulti experiments. We hope this work will bring more thoughts and explorations on multimodal fusion in language tasks.
引用
收藏
页数:9
相关论文
共 50 条
  • [1] Online Zero-Shot Classification with CLIP
    Qian, Qi
    Hu, Juhua
    COMPUTER VISION - ECCV 2024, PT LXXVII, 2024, 15135 : 462 - 477
  • [2] Zero-Shot Turkish Text Classification
    Birim, Ahmet
    Erden, Mustafa
    Arslan, Levent M.
    29TH IEEE CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS (SIU 2021), 2021,
  • [3] Multimodal Ensembling for Zero-Shot Image Classification
    Hickmon, Javon
    THIRTY-EIGTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 21, 2024, : 23747 - 23749
  • [4] Retrieval Augmented Zero-Shot Text Classification
    Abdullahi, Tassallah
    Singh, Ritambhara
    Eickhoff, Carsten
    PROCEEDINGS OF THE 2024 ACM SIGIR INTERNATIONAL CONFERENCE ON THE THEORY OF INFORMATION RETRIEVAL, ICTIR 2024, 2024, : 195 - 203
  • [5] Label Augmentation for Zero-Shot Hierarchical Text Classification
    Paletto, Lorenzo
    Basile, Valerio
    Esposito, Roberto
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 7697 - 7706
  • [6] Unified benchmark for zero-shot Turkish text classification
    celik, Emrecan
    Dalyan, Tugba
    INFORMATION PROCESSING & MANAGEMENT, 2023, 60 (03)
  • [7] Extreme Zero-Shot Learning for Extreme Text Classification
    Xiong, Yuanhao
    Chang, Wei-Cheng
    Hsieh, Cho-Jui
    Yu, Hsiang-Fu
    Dhillon, Inderjit
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 5455 - 5468
  • [8] Learn to Adapt for Generalized Zero-Shot Text Classification
    Zhang, Yiwen
    Yuan, Caixia
    Wang, Xiaojie
    Bai, Ziwei
    Liu, Yongbin
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 517 - 527
  • [9] CLIPTEXT: A New Paradigm for Zero-shot Text Classification
    Qin, Libo
    Wang, Weiyun
    Chen, Qiguang
    Che, Wanxiang
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 1077 - 1088
  • [10] Generalized Zero-Shot Text Classification for ICD Coding
    Song, Congzheng
    Zhang, Shanghang
    Sadoughi, Najmeh
    Xie, Pengtao
    Xing, Eric
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 4018 - 4024