50 entries in total
- [32] Gated Multi-modal Fusion with Cross-modal Contrastive Learning for Video Question Answering [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VII, 2023, 14260 : 427 - 438
- [33] Sequential Visual Reasoning for Visual Question Answering [J]. PROCEEDINGS OF 2018 5TH IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (CCIS), 2018, : 410 - 415
- [35] Chain of Reasoning for Visual Question Answering [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
- [37] A Spatial Hierarchical Reasoning Network for Remote Sensing Visual Question Answering [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
- [38] Cascade Reasoning Network for Text-based Visual Question Answering [J]. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 4060 - 4069
- [40] Cross-Modal Self-Attention with Multi-Task Pre-Training for Medical Visual Question Answering [J]. PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR '21), 2021, : 456 - 460