50 entries in total
- [31] 3D Vision and Language Pretraining with Large-Scale Synthetic Data. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI 2024, 2024: 1552-1560
- [33] Free-Form Instruction Guided Robotic Navigation Path Planning with Large Vision-Language Model. Intelligent Robotics and Applications, ICIRA 2024, Pt IX, 2025, 15209: 381-396
- [35] Understanding Contexts Inside Robot and Human Manipulation Tasks through Vision-Language Model and Ontology System in Video Streams. 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020: 8366-8372
- [39] Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023: 15005-15015