50 items in total
- [2] Multimodal Pretraining for Dense Video Captioning [J]. 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (AACL-IJCNLP 2020), 2020: 470-490
- [4] Deep multimodal embedding for video captioning [J]. Multimedia Tools and Applications, 2019, 78(22): 31793-31805
- [5] Multimodal Semantic Attention Network for Video Captioning [J]. 2019 IEEE International Conference on Multimedia and Expo (ICME), 2019: 1300-1305
- [6] Video Captioning with Guidance of Multimodal Latent Topics [J]. Proceedings of the 2017 ACM Multimedia Conference (MM'17), 2017: 1838-1846
- [10] Multimodal attention-based transformer for video captioning [J]. Applied Intelligence, 2023, 53(20): 23349-23368