50 entries in total
- [23] Multi-Modal Image Captioning for the Visually Impaired. 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2021), 2021: 53-60
- [26] Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019), 2019: 6264-6273
- [27] Multi-level Fusion of Multi-modal Semantic Embeddings for Zero Shot Learning. Proceedings of the 2022 International Conference on Multimodal Interaction, ICMI 2022, 2022: 310-318
- [29] MAM-RNN: Multi-level Attention Model Based RNN for Video Captioning. Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017: 2208-2214
- [30] Multi-modal Video Summarization. ICMR 2024 - Proceedings of the 2024 International Conference on Multimedia Retrieval, 2024: 1214-1218