50 items in total
- [11] DynGraph: Visual Question Answering via Dynamic Scene Graphs. PATTERN RECOGNITION, DAGM GCPR 2019, 2019, 11824: 428-441
- [12] Variational Causal Inference Network for Explanatory Visual Question Answering. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023: 2515-2525
- [13] Large Language Models are Temporal and Causal Reasoners for Video Question Answering. 2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023: 4300-4316
- [14] Discovering the Real Association: Multimodal Causal Reasoning in Video Question Answering. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 19027-19036
- [15] Coarse-to-Fine Visual Question Answering by Iterative, Conditional Refinement. IMAGE ANALYSIS AND PROCESSING, ICIAP 2022, PT II, 2022, 13232: 418-428
- [17] An Empirical Study of Multilingual Scene-Text Visual Question Answering. PROCEEDINGS OF THE 2ND WORKSHOP ON USER-CENTRIC NARRATIVE SUMMARIZATION OF LONG VIDEOS, NARSUM 2023, 2023: 3-8
- [18] Improving visual question answering by combining scene-text information. Multimedia Tools and Applications, 2022, 81: 12177-12208
- [19] Towards Video Text Visual Question Answering: Benchmark and Baseline. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022