CLIPMulti: Explore the performance of multimodal enhanced CLIP for zero-shot text classification

Authors
Wang, Peng [1, 2]
Li, Dagang [1, 2]
Hu, Xuesi [1, 2, 3]
Wang, Yongmei [1, 2, 4]
Zhang, Youhua [1, 2]
Affiliations
[1] Anhui Agr Univ, Sch Informat & Artificial Intelligence, Hefei, Anhui, Peoples R China
[2] Macau Univ Sci & Technol, Fac Innovat Engn, Sch Comp Sci & Engn, Ave Wai Long, Taipa 999078, Macao, Peoples R China
[3] Anhui Agr Univ, Sch Econ & Management, Hefei, Anhui, Peoples R China
[4] Anhui Prov Engn Lab Beidou Precis Agr Informat, Hefei, Anhui, Peoples R China
Keywords
Zero-shot text classification; CLIP; Multimodality
DOI
10.1016/j.csl.2024.101748
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Zero-shot text classification does not require large amounts of labeled data and is designed to handle text classification tasks that lack annotated training data. Existing zero-shot text classification methods use either a text-text matching paradigm or a text-image matching paradigm, and these show good performance on different benchmark datasets. However, the existing classification paradigms consider only a single modality for text matching, and little attention has been paid to how multimodality can help text classification. To incorporate multimodality into zero-shot text classification, we propose a multimodal enhanced CLIP framework (CLIPMulti), which employs a text-image&text matching paradigm to improve the effectiveness of zero-shot text classification. Three different image and text combinations are tested for their effects on zero-shot text classification, and a matching method (Match-CLIPMulti) is further proposed to automatically find the corresponding text based on the classified images. We conducted experiments on seven publicly available zero-shot text classification datasets and achieved competitive performance. In addition, we analyzed the effect of different parameters on the Match-CLIPMulti experiments. We hope this work will prompt further thought and exploration on multimodal fusion in language tasks.
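The text-image&text matching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how image and text are combined, so the weighted sum below, the `alpha` parameter, the function names, and the toy 3-dimensional vectors (standing in for real CLIP embeddings) are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x):
    """Scale vectors along the last axis to unit length."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def classify(text_emb, label_image_embs, label_text_embs, alpha=0.5):
    """Pick the label whose fused image+text prototype best matches the input text.

    alpha is a hypothetical weight for the image side; the fusion by
    weighted sum is an assumption, not a detail taken from the paper.
    """
    fused = alpha * l2_normalize(label_image_embs) \
        + (1.0 - alpha) * l2_normalize(label_text_embs)
    prototypes = l2_normalize(fused)
    # Cosine similarity between the input text and each label prototype.
    scores = prototypes @ l2_normalize(text_emb)
    return int(np.argmax(scores)), scores


# Toy embeddings for two labels (placeholders for CLIP encoder outputs).
label_image_embs = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
label_text_embs = np.array([[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]])
text_emb = np.array([0.2, 0.8, 0.1])  # input text to classify

predicted, scores = classify(text_emb, label_image_embs, label_text_embs)
print(predicted, scores)
```

In a real setting the embeddings would come from a CLIP text encoder and image encoder, and the choice of fusion and weighting would need to follow the paper itself.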
Pages: 9
Related papers
50 in total
  • [31] Prompt-based Zero-shot Text Classification with Conceptual Knowledge
    Wang, Yuqi
    Wang, Wei
    Chen, Qi
    Huang, Kaizhu
    Nguyen, Anh
    De, Suparna
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-SRW 2023, VOL 4, 2023, : 30 - 38
  • [32] CLZT: A Contrastive Learning Based Framework for Zero-Shot Text Classification
    Li, Kun
    Lin, Meng
    Hu, Songlin
    Li, Ruixuan
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2022, PT II, 2022, : 623 - 630
  • [33] Zero-Shot Text Classification via Self-Supervised Tuning
    Liu, Chaoqun
    Zhang, Wenxuan
    Chen, Guizhen
    Wu, Xiaobao
    Luu, Anh Tuan
    Chang, Chip Hong
    Bing, Lidong
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 1743 - 1761
  • [34] Multi-view enhanced zero-shot node classification
    Wang, Jiahui
    Wu, Likang
    Zhao, Hongke
    Jia, Ning
    INFORMATION PROCESSING & MANAGEMENT, 2023, 60 (06)
  • [35] Zero-Shot Text Classification with Semantically Extended Graph Convolutional Network
    Liu, Tengfei
    Hu, Yongli
    Gao, Junbin
    Sun, Yanfeng
    Yin, Baocai
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 8352 - 8359
  • [36] The Benefits of Label-Description Training for Zero-Shot Text Classification
    Gao, Lingyu
    Ghosh, Debanjan
    Gimpel, Kevin
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 13823 - 13844
  • [37] Zero-shot Text Classification via Reinforced Self-training
    Ye, Zhiquan
    Geng, Yuxia
    Chen, Jiaoyan
    Xu, Xiaoxiao
    Zheng, Suhang
    Wang, Feng
    Chen, Jingmin
    Zhang, Jun
    Chen, Huajun
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 3014 - 3024
  • [38] Comprehensive Study on Zero-Shot Text Classification Using Category Mapping
    Zhang, Kai
    Zhang, Qiuxia
    Wang, Chung-Che
    Jang, Jyh-Shing Roger
    IEEE ACCESS, 2025, 13 : 23526 - 23546
  • [39] ProZe: Explainable and Prompt-Guided Zero-Shot Text Classification
    Harrando, Ismail
    Reboud, Alison
    Schleider, Thomas
    Ehrhart, Thibault
    Troncy, Raphael
    IEEE INTERNET COMPUTING, 2022, 26 (06) : 69 - 77
  • [40] Label Agnostic Pre-training for Zero-shot Text Classification
    Clarke, Christopher
    Heng, Yuzhao
    Kang, Yiping
    Flautner, Krisztian
    Tang, Lingjia
    Mars, Jason
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 1009 - 1021