50 entries in total
- [2] Cross-modal Semantic Alignment Pre-training for Vision-and-Language Navigation. Proceedings of the 30th ACM International Conference on Multimedia, MM 2022, 2022, pp. 4233-4241
- [3] Grounded Entity-Landmark Adaptive Pre-training for Vision-and-Language Navigation. 2023 IEEE/CVF International Conference on Computer Vision (ICCV 2023), 2023, pp. 12009-12019
- [5] History Aware Multimodal Transformer for Vision-and-Language Navigation. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021, vol. 34
- [6] Weakly Supervised Vision-and-Language Pre-training with Relative Representations. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023, Vol. 1, 2023, pp. 8341-8355
- [7] Unsupervised Vision-and-Language Pre-training Without Parallel Images and Captions. 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2021), 2021, pp. 5339-5350
- [8] Align, Reason and Learn: Enhancing Medical Vision-and-Language Pre-training with Knowledge. Proceedings of the 30th ACM International Conference on Multimedia, MM 2022, 2022, pp. 5152-5161
- [9] Multi-modal Masked Autoencoders for Medical Vision-and-Language Pre-training. Medical Image Computing and Computer Assisted Intervention, MICCAI 2022, Part V, 2022, vol. 13435, pp. 679-689
- [10] Towards Unifying Medical Vision-and-Language Pre-training via Soft Prompts. 2023 IEEE/CVF International Conference on Computer Vision (ICCV 2023), 2023, pp. 23346-23356