50 items in total
- [1] Multimodal attention-based transformer for video captioning [J]. Applied Intelligence, 2023, 53 : 23349 - 23368
- [2] Hierarchical attention-based multimodal fusion for video captioning [J]. NEUROCOMPUTING, 2018, 315 : 362 - 370
- [4] Residual attention-based LSTM for video captioning [J]. WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2019, 22 (02): 621 - 636
- [6] Attention-based Densely Connected LSTM for Video Captioning [J]. PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019: 802 - 810
- [7] A Hierarchical Multimodal Attention-based Neural Network for Image Captioning [J]. SIGIR'17: PROCEEDINGS OF THE 40TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2017: 889 - 892
- [8] Attention-Based Multimodal Fusion for Video Description [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 4203 - 4212