50 items in total
- [42] REFINING ATTENTION: A SEQUENTIAL ATTENTION MODEL FOR IMAGE CAPTIONING [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2018
- [43] Boosted Attention: Leveraging Human Attention for Image Captioning [J]. COMPUTER VISION - ECCV 2018, PT XI, 2018, 11215 : 72 - 88
- [45] MIXED KNOWLEDGE RELATION TRANSFORMER FOR IMAGE CAPTIONING [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022 : 4403 - 4407
- [46] Multimodal attention-based transformer for video captioning [J]. APPLIED INTELLIGENCE, 2023, 53 (20) : 23349 - 23368
- [47] A Position-Aware Transformer for Image Captioning [J]. CMC-COMPUTERS MATERIALS & CONTINUA, 2022, 70 (01) : 2065 - 2081
- [50] Retrieval-Augmented Transformer for Image Captioning [J]. 19TH INTERNATIONAL CONFERENCE ON CONTENT-BASED MULTIMEDIA INDEXING, CBMI 2022, 2022 : 1 - 7