50 entries in total
- [2] LAVT: Language-Aware Vision Transformer for Referring Image Segmentation. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), 2022: 18134-18144
- [4] Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation. 2023 IEEE/CVF International Conference on Computer Vision (ICCV 2023), 2023: 17457-17466
- [6] CARIS: Context-Aware Referring Image Segmentation. Proceedings of the 31st ACM International Conference on Multimedia, MM 2023, 2023: 779-788
- [7] Bottom-Up Shift and Reasoning for Referring Image Segmentation. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021, 2021: 11261-11270
- [8] SLViT: Scale-Wise Language-Guided Vision Transformer for Referring Image Segmentation. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI 2023, 2023: 1294-1302
- [9] Visionary: vision-aware enhancement with reminding scenes generated by captions via multimodal transformer for embodied referring expression. Visual Computer, 2024: 1673-1688
- [10] Vision-Language Transformer and Query Generation for Referring Segmentation. 2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021), 2021: 16301-16310