50 entries in total
- [41] MedFuseNet: An attention-based multimodal deep learning model for visual question answering in the medical domain [J]. Scientific Reports, 11
- [45] Multimodal fusion: advancing medical visual question-answering [J]. Neural Computing and Applications, 2024, 36 (33) : 20949 - 20962
- [46] Multimodal Local Perception Bilinear Pooling for Visual Question Answering [J]. IEEE Access, 2018, 6 : 57923 - 57932
- [47] Dual-Key Multimodal Backdoors for Visual Question Answering [J]. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), 2022 : 15354 - 15364
- [48] From Pixels to Objects: Cubic Visual Attention for Visual Question Answering [J]. Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018 : 906 - 912
- [49] Contrastive training of a multimodal encoder for medical visual question answering [J]. Intelligent Systems with Applications, 2023, 18
- [50] Multimodal Graph Networks for Compositional Generalization in Visual Question Answering [J]. Advances in Neural Information Processing Systems 33, NeurIPS 2020, 2020, 33