49 entries in total
- [31] InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023.
- [32] Towards Multimodal Vision-Language Models Generating Non-generic Text. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, pp. 13138-13139.
- [33] EfficientVLM: Fast and Accurate Vision-Language Models via Knowledge Distillation and Modal-adaptive Pruning. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, pp. 13899-13913.
- [34] Task-Oriented Multi-Modal Mutual Learning for Vision-Language Models. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, pp. 21902-21912.
- [37] Learning by Correction: Efficient Tuning Task for Zero-Shot Generative Vision-Language Reasoning. 2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, pp. 13428-13437.
- [38] TAGGAR: General-Purpose Task Guidance from Natural Language in Augmented Reality using Vision-Language Models. PROCEEDINGS OF THE 2024 ACM SYMPOSIUM ON SPATIAL USER INTERACTION, SUI 2024, 2024.
- [39] Experiential Views: Towards Human Experience Evaluation of Designed Spaces using Vision-Language Models. EXTENDED ABSTRACTS OF THE 2024 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, CHI 2024, 2024.
- [40] CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, pp. 22200-22210.