50 entries in total
- [1] Masked Vision-language Transformer in Fashion [J]. Machine Intelligence Research, 2023, 20 : 421 - 434
- [2] TVLT: Textless Vision-Language Transformer [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
- [3] Vision-Language Transformer and Query Generation for Referring Segmentation [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 16301 - 16310
- [6] MAGVLT: Masked Generative Vision-and-Language Transformer [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 23338 - 23348
- [7] FashionVLP: Vision Language Transformer for Fashion Retrieval with Feedback [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022: 14085 - 14095
- [8] Target-Driven Structured Transformer Planner for Vision-Language Navigation [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022: 4194 - 4203
- [9] Unifying Vision-Language Representation Space with Single-Tower Transformer [J]. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1, 2023: 980 - 988
- [10] Kaleido-BERT: Vision-Language Pre-training on Fashion Domain [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 12642 - 12652