50 items in total
- [1] Multimodal Semantic Attention Network for Video Captioning [J]. 2019 IEEE International Conference on Multimedia and Expo (ICME), 2019: 1300-1305
- [2] Learning Semantic Concepts and Temporal Alignment for Narrated Video Procedural Captioning [J]. MM '20: Proceedings of the 28th ACM International Conference on Multimedia, 2020: 4337-4345
- [3] Multimodal-enhanced hierarchical attention network for video captioning [J]. Multimedia Systems, 2023, 29: 2469-2482
- [8] Multirate Multimodal Video Captioning [J]. Proceedings of the 2017 ACM Multimedia Conference (MM'17), 2017: 1877-1882
- [9] Semantic Tag Augmented XlanV Model for Video Captioning [J]. Proceedings of the 29th ACM International Conference on Multimedia, MM 2021, 2021: 4818-4822