50 items in total
- [1] Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021: 13238-13246
- [2] Cross-Modal Attribute Insertions for Assessing the Robustness of Vision-and-Language Learning [J]. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023: 15974-15990
- [3] A Cross-Modal Object-Aware Transformer for Vision-and-Language Navigation [J]. 2022 IEEE 34TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI, 2022: 976-981
- [4] Cross-modal Semantic Alignment Pre-training for Vision-and-Language Navigation [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022: 4233-4241
- [6] Topological Planning with Transformers for Vision-and-Language Navigation [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 11271-11281
- [7] Cross-modal Map Learning for Vision and Language Navigation [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 15439-15449
- [8] UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 4153-4163
- [10] Transformer-Exclusive Cross-Modal Representation for Vision and Language [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021: 2719-2725