50 items in total
- [3] Learning Multimodal Attention LSTM Networks for Video Captioning [J]. PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017: 537 - 545
- [4] Multirate Multimodal Video Captioning [J]. PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017: 1877 - 1882
- [5] Deep multimodal embedding for video captioning [J]. Multimedia Tools and Applications, 2019, 78 : 31793 - 31805
- [6] Multimodal Pretraining for Dense Video Captioning [J]. 1ST CONFERENCE OF THE ASIA-PACIFIC CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 10TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (AACL-IJCNLP 2020), 2020: 470 - 490
- [7] Deep multimodal embedding for video captioning [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (22) : 31793 - 31805
- [8] MULTIMODAL SEMANTIC ATTENTION NETWORK FOR VIDEO CAPTIONING [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019: 1300 - 1305
- [9] Video Captioning with Guidance of Multimodal Latent Topics [J]. PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017: 1838 - 1846
- [10] Temporal Attention Feature Encoding for Video Captioning [J]. 2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2020: 1279 - 1282