共 50 条
- [3] Multi-modal Dense Video Captioning [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020, : 4117 - 4126
- [4] MULTI-MODAL HIERARCHICAL ATTENTION-BASED DENSE VIDEO CAPTIONING [J]. 2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 475 - 479
- [5] Watch, Listen and Tell: Multi-modal Weakly Supervised Dense Event Captioning [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 8907 - 8916
- [6] Multi-modal Dependency Tree for Video Captioning [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
- [8] MRCap: Multi-modal and Multi-level Relationship-based Dense Video Captioning [J]. 2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 2615 - 2620