50 entries in total
- [1] Multi-task Learning of Hierarchical Vision-Language Representation [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 10484-10493
- [3] CALM-Bench: A Multi-task Benchmark for Evaluating Causality Aware Language Models [J]. 17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023: 296-311
- [5] Task Residual for Tuning Vision-Language Models [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 10899-10909
- [8] On Evaluating Adversarial Robustness of Large Vision-Language Models [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
- [9] Task-Oriented Multi-Modal Mutual Learning for Vision-Language Models [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023: 21902-21912
- [10] Food-500 Cap: A Fine-Grained Food Caption Benchmark for Evaluating Vision-Language Models [J]. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 5674-5685