50 entries in total
- [1] Reasoning with Heterogeneous Graph Alignment for Video Question Answering [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 11109 - 11116
- [2] Dynamic Multistep Reasoning based on Video Scene Graph for Video Question Answering [J]. NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022 : 3894 - 3904
- [3] Discovering the Real Association: Multimodal Causal Reasoning in Video Question Answering [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023 : 19027 - 19036
- [5] Multimodal Graph Transformer for Multimodal Question Answering [J]. 17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023 : 189 - 200
- [8] Visual Question Answering on CLEVR Dataset via Multimodal Fusion and Relational Reasoning [J]. 2021 52ND ANNUAL IRANIAN MATHEMATICS CONFERENCE (AIMC), 2021 : 74 - 76
- [9] Video Graph Transformer for Video Question Answering [J]. COMPUTER VISION, ECCV 2022, PT XXXVI, 2022, 13696 : 39 - 58