50 entries in total
- [22] MGSGA: Multi-grained and Semantic-Guided Alignment for Text-Video Retrieval. Neural Processing Letters, 56.
- [23] Deep Video Understanding with a Unified Multi-Modal Retrieval Framework. Proceedings of the 30th ACM International Conference on Multimedia (MM 2022), 2022: 7055-7059.
- [24] Text-Guided Multi-Modal Fusion for Underwater Visual Tracking. 2024 IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS 2024), 2024.
- [25] Efficient text-to-video retrieval via multi-modal multi-tagger derived pre-screening. Visual Intelligence, 2025, 3(1).
- [26] MMFusion: A Generalized Multi-Modal Fusion Detection Framework. 2023 IEEE International Conference on Development and Learning (ICDL), 2023: 415-422.
- [28] Dual-Modal Attention-Enhanced Text-Video Retrieval with Triplet Partial Margin Contrastive Learning. Proceedings of the 31st ACM International Conference on Multimedia (MM 2023), 2023: 4626-4636.
- [30] Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022: 11727-11736.