50 items in total
- [2] A Framework for Video-Text Retrieval with Noisy Supervision [J]. PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, ICMI 2022, 2022, : 373 - 383
- [3] Multi-event Video-Text Retrieval [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 22056 - 22066
- [4] Unified Coarse-to-Fine Alignment for Video-Text Retrieval [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 2804 - 2815
- [5] Boosting Video-Text Retrieval with Explicit High-Level Semantics [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 4887 - 4898
- [6] CLIP Based Multi-Event Representation Generation for Video-Text Retrieval [J]. Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2023, 60 (09): : 2169 - 2179
- [8] Tagging before Alignment: Integrating Multi-Modal Tags for Video-Text Retrieval [J]. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1, 2023, : 396 - 404
- [9] A multi-level framework for video shot structuring [J]. IMAGE ANALYSIS AND RECOGNITION, 2005, 3656 : 167 - 173