41 items in total
- [22] Visual question answering model based on the fusion of multimodal features by a two-way co-attention mechanism IMAGING SCIENCE JOURNAL, 2021, 69 (1-4): 177 - 189
- [24] RVT-Transformer: Residual Attention in Answerability Prediction on Visual Question Answering for Blind People ADVANCES IN COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2022, 2022, 1653 : 423 - 435
- [25] Improved Fusion of Visual and Language Representations by Dense Symmetric Co-Attention for Visual Question Answering 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 6087 - 6096
- [27] Cross-Modal Multistep Fusion Network With Co-Attention for Visual Question Answering IEEE ACCESS, 2018, 6 : 31516 - 31524
- [30] Research on visual question answering based on dynamic memory network model of multiple attention mechanisms Scientific Reports, 12