50 entries in total
- [31] Discriminative Latent Semantic Graph for Video Captioning [J]. PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021 : 3556 - 3564
- [33] STSI: Efficiently Mine Spatio-Temporal Semantic Information between Different Multimodal for Video Captioning [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2022
- [36] Multimodal graph neural network for video procedural captioning [J]. NEUROCOMPUTING, 2022, 488 : 88 - 96
- [37] Multimodal attention-based transformer for video captioning [J]. APPLIED INTELLIGENCE, 2023, 53 (20) : 23349 - 23368
- [39] Learning Multimodal Attention LSTM Networks for Video Captioning [J]. PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017 : 537 - 545
- [40] Hierarchical Vision-Language Alignment for Video Captioning [J]. MULTIMEDIA MODELING (MMM 2019), PT I, 2019, 11295 : 42 - 54