共 50 条
- [5] Multi-Modal Fusion Transformer for Visual Question Answering in Remote Sensing IMAGE AND SIGNAL PROCESSING FOR REMOTE SENSING XXVIII, 2022, 12267
- [9] Gated Multi-modal Fusion with Cross-modal Contrastive Learning for Video Question Answering ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VII, 2023, 14260 : 427 - 438
- [10] MoQA - A Multi-modal Question Answering Architecture COMPUTER VISION - ECCV 2018 WORKSHOPS, PT IV, 2019, 11132 : 106 - 113