50 entries in total
- [1] HierVL: Learning Hierarchical Video-Language Embeddings. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, pp. 23066-23078
- [2] Exploring Temporal Concurrency for Video-Language Representation Learning. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, pp. 15522-15532
- [4] Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
- [5] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
- [6] Learning Trajectory-Word Alignments for Video-Language Tasks. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, pp. 2504-2514
- [7] LAVENDER: Unifying Video-Language Understanding as Masked Language Modeling. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, pp. 23119-23129
- [8] Depth-Aware Sparse Transformer for Video-Language Learning. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, pp. 4778-4787
- [9] UniVTG: Towards Unified Video-Language Temporal Grounding. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, pp. 2782-2792
- [10] Probabilistic Representations for Video Contrastive Learning. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, pp. 14691-14701