50 entries in total
- [41] Multi-modality Latent Interaction Network for Visual Question Answering [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019 : 5824 - 5834
- [42] Language-Guided Visual Aggregation Network for Video Question Answering [J]. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023 : 5195 - 5203
- [43] Mutual Attention Inception Network for Remote Sensing Visual Question Answering [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
- [45] Auto-Parsing Network for Image Captioning and Visual Question Answering [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021 : 2177 - 2187
- [46] A Spatial Hierarchical Reasoning Network for Remote Sensing Visual Question Answering [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
- [47] Co-attention graph convolutional network for visual question answering [J]. MULTIMEDIA SYSTEMS, 2023, 29 (05) : 2527 - 2543
- [48] More Than An Answer: Neural Pivot Network for Visual Question Answering [J]. PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017 : 681 - 689
- [49] Cascade Reasoning Network for Text-based Visual Question Answering [J]. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020 : 4060 - 4069
- [50] Cross-modal Relational Reasoning Network for Visual Question Answering [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021 : 3939 - 3948