50 entries in total
- [31] Rethinking Data Augmentation for Robust Visual Question Answering. COMPUTER VISION, ECCV 2022, PT XXXVI, 2022, 13696: 95-112
- [32] Multimodal Local Perception Bilinear Pooling for Visual Question Answering. IEEE ACCESS, 2018, 6: 57923-57932
- [34] Greedy Gradient Ensemble for Robust Visual Question Answering. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 1564-1573
- [35] Dual-Key Multimodal Backdoors for Visual Question Answering. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 15354-15364
- [36] FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024: 1330-1350
- [38] Contrastive training of a multimodal encoder for medical visual question answering. INTELLIGENT SYSTEMS WITH APPLICATIONS, 2023, 18
- [39] Multimodal Graph Networks for Compositional Generalization in Visual Question Answering. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
- [40] Towards Building a Robust Knowledge Intensive Question Answering Model with Large Language Models. NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT I, NLPCC 2024, 2025, 15359: 228-242