50 results in total
- [22] Depth-Aware Vision-and-Language Navigation using Scene Query Attention Network. 2022 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2022, 2022, pp. 9390-9396
- [24] A Comparison of Pre-trained Vision-and-Language Models for Multimodal Representation Learning across Medical Images and Reports. 2020 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2020, pp. 1999-2004
- [28] A Dual Semantic-Aware Recurrent Global-Adaptive Network for Vision-and-Language Navigation. PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, pp. 1479-1487
- [30] A Framework for Vision-Language Warm-up Tasks in Multimodal Dialogue Models. 2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, pp. 2789-2799