50 entries in total
- [31] Image as a Foreign Language: BEIT Pretraining for Vision and Vision-Language Tasks 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 19175 - 19186
- [32] Reinforcement Learning Friendly Vision-Language Model for Minecraft COMPUTER VISION - ECCV 2024, PT XXXVII, 2025, 15095 : 1 - 17
- [33] A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-Language Model COMPUTER VISION, ECCV 2022, PT XXIX, 2022, 13689 : 736 - 753
- [34] Visual In-Context Learning for Large Vision-Language Models FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 15890 - 15902
- [35] Learning the Visualness of Text Using Large Vision-Language Models 2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 2394 - 2408
- [36] Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 2691 - 2700
- [37] PROMETHEUS-VISION: Vision-Language Model as a Judge for Fine-Grained Evaluation FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 11286 - 11315
- [40] Towards Better Vision-Inspired Vision-Language Models 2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 13537 - 13547