共 50 条
- [21] Unified Vision-Language Pre-Training for Image Captioning and VQA THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 13041 - 13049
- [22] Multimodal Pre-training Method for Vision-language Understanding and Generation Ruan Jian Xue Bao/Journal of Software, 2023, 34 (05): : 2024 - 2034
- [23] Towards Adversarial Attack on Vision-Language Pre-training Models PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 5005 - 5013
- [25] Position-guided Text Prompt for Vision-Language Pre-training 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 23242 - 23251
- [26] Knowledge Boosting: Rethinking Medical Contrastive Vision-Language Pre-training MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT I, 2023, 14220 : 405 - 415
- [27] Vision-Language Pre-Training: Basics, Recent Advances, and Future Trends FOUNDATIONS AND TRENDS IN COMPUTER GRAPHICS AND VISION, 2022, 14 (3-4): : 163 - 352
- [28] Kaleido-BERT: Vision-Language Pre-training on Fashion Domain 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 12642 - 12652
- [29] Subsampling of Frequent Words in Text for Pre-training a Vision-Language Model PROCEEDINGS OF THE 1ST WORKSHOP ON LARGE GENERATIVE MODELS MEET MULTIMODAL APPLICATIONS, LGM3A 2023, 2023, : 61 - 67
- [30] EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,