50 items in total
- [21] Decouple Before Interact: Multi-Modal Prompt Learning for Continual Visual Question Answering. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 2941-2950
- [23] Differentiated Attention with Multi-modal Reasoning for Video Question Answering. 2022 IEEE International Conference on Electrical Engineering, Big Data and Algorithms (EEBDA), 2022, pp. 525-530
- [25] Answer-checking in Context: A Multi-modal Fully Attention Network for Visual Question Answering. 2020 25th International Conference on Pattern Recognition (ICPR), 2021, pp. 1173-1180
- [26] Multi-modal Factorized Bilinear Pooling with Co-Attention Learning for Visual Question Answering. 2017 IEEE International Conference on Computer Vision (ICCV), 2017, pp. 1839-1848
- [28] NuScenes-QA: A Multi-Modal Visual Question Answering Benchmark for Autonomous Driving Scenario. Thirty-Eighth AAAI Conference on Artificial Intelligence, Vol. 38 No. 5, 2024, pp. 4542-4550
- [30] Advancing Video Question Answering with a Multi-modal and Multi-layer Question Enhancement Network. Proceedings of the 31st ACM International Conference on Multimedia (MM 2023), 2023, pp. 3985-3993