50 items in total
- [1] End-to-End Dense Video Captioning with Parallel Decoding [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 6827 - 6837
- [2] End-to-End Video Captioning [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 1474 - 1482
- [3] Accelerated masked transformer for dense video captioning [J]. NEUROCOMPUTING, 2021, 445 : 72 - 80
- [6] Video Caption Based Searching Using End-to-End Dense Captioning and Sentence Embeddings [J]. SYMMETRY-BASEL, 2020, 12 (06):
- [7] End-to-End Transformer Based Model for Image Captioning [J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 2585 - 2594
- [8] End-to-end Generative Pretraining for Multimodal Video Captioning [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 17938 - 17947
- [9] End-to-End Video Captioning with Multitask Reinforcement Learning [J]. 2019 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2019, : 339 - 348
- [10] End-to-End Video Text Spotting with Transformer [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (09) : 4019 - 4035