50 items in total
- [2] Multimodal Semantic Attention Network for Video Captioning [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019 : 1300 - 1305
- [3] Hierarchical attention-based multimodal fusion for video captioning [J]. NEUROCOMPUTING, 2018, 315 : 362 - 370
- [5] A Hierarchical Multimodal Attention-based Neural Network for Image Captioning [J]. SIGIR'17: PROCEEDINGS OF THE 40TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2017 : 889 - 892
- [7] Hierarchical Attention Network for Image Captioning [J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019 : 8957 - 8964
- [8] Hierarchical Modular Network for Video Captioning [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022 : 17918 - 17927
- [10] Multimodal attention-based transformer for video captioning [J]. APPLIED INTELLIGENCE, 2023, 53 (20) : 23349 - 23368