50 entries in total
- [22] Leveraging per Image-Token Consistency for Vision-Language Pre-training [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 19155-19164
- [23] Vision-Language Pre-Training with Triple Contrastive Learning [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 15650-15659
- [24] GilBERT: Generative Vision-Language Pre-Training for Image-Text Retrieval [J]. SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021: 1379-1388
- [26] BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
- [27] MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 23262-23271
- [28] Multimodal detection of hateful memes by applying a vision-language pre-training model [J]. PLOS ONE, 2022, 17 (09)
- [29] Vision-Language Pre-Training for Boosting Scene Text Detectors [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 15660-15670