50 entries in total
- [1] Structured Scene Memory for Vision-Language Navigation [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 8451 - 8460
- [2] e-ViL: A Dataset and Benchmark for Natural Language Explanations in Vision-Language Tasks [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 1224 - 1234
- [3] Unsupervised Vision-Language Parsing: Seamlessly Bridging Visual Scene Graphs with Language Structures via Dependency Relationships [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 15586 - 15595
- [4] Language Features Matter: Effective Language Representations for Vision-Language Tasks [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 7473 - 7482
- [5] VinVL: Revisiting Visual Representations in Vision-Language Models [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 5575 - 5584
- [6] LifeGraph 4 - Lifelog Retrieval using Multimodal Knowledge Graphs and Vision-Language Models [J]. PROCEEDINGS OF 2024 ACM WORKSHOP ON THE LIFELOG SEARCH CHALLENGE, LSC 2024, 2024, : 88 - 92
- [7] Are Vision-Language Transformers Learning Multimodal Representations? A Probing Perspective [J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 11248 - 11257
- [8] "This Is My Unicorn, Fluffy": Personalizing Frozen Vision-Language Representations [J]. COMPUTER VISION, ECCV 2022, PT XX, 2022, 13680 : 558 - 577
- [9] FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 2669 - 2680
- [10] Vision-Language Pre-Training for Boosting Scene Text Detectors [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 15660 - 15670