50 items in total
- [2] MAPS: Joint Multimodal Attention and POS Sequence Generation for Video Captioning [J]. 2021 INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2021
- [3] Multimodal Deep Neural Network with Image Sequence Features for Video Captioning [J]. 2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018
- [4] Controllable Video Captioning with POS Sequence Guidance Based on Gated Fusion Network [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 2641-2650
- [5] Sequence to Sequence - Video to Text [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 4534-4542
- [6] Video sequence matching [J]. PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998: 3697-3700
- [7] Self-critical Sequence Training for Image Captioning [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 1179-1195
- [8] Sequence-to-Sequence Video Prediction by Learning Hierarchical Representations [J]. APPLIED SCIENCES-BASEL, 2020, 10 (22): 1-14
- [9] TRIPLE SEQUENCE GENERATIVE ADVERSARIAL NETS FOR UNSUPERVISED IMAGE CAPTIONING [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021: 7598-7602