50 results in total
- [1] Text-Video Retrieval via Multi-Modal Hypergraph Networks PROCEEDINGS OF THE 17TH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, WSDM 2024, 2024, : 369 - 377
- [2] Multi-Modal Representation Learning with Text-Driven Soft Masks 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 2798 - 2807
- [3] UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 3032 - 3041
- [4] MIM: Lightweight Multi-Modal Interaction Model for Joint Video Moment Retrieval and Highlight Detection 2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 1961 - 1966
- [5] CRET: Cross-Modal Retrieval Transformer for Efficient Text-Video Retrieval PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22), 2022, : 949 - 959
- [8] Tagging before Alignment: Integrating Multi-Modal Tags for Video-Text Retrieval THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1, 2023, : 396 - 404
- [9] VTLayout: A Multi-Modal Approach for Video Text Layout PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 2775 - 2784