50 items in total
- [21] Learning Trimodal Relation for Audio-Visual Question Answering with Missing Modality. COMPUTER VISION - ECCV 2024, PT XV, 2025, 15073: 42-59
- [22] Progressive Spatio-temporal Perception for Audio-Visual Question Answering. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 7808-7816
- [23] Object-Difference Attention: A Simple Relational Attention for Visual Question Answering. PROCEEDINGS OF THE 2018 ACM MULTIMEDIA CONFERENCE (MM'18), 2018: 519-527
- [24] Multi-Channel Co-Attention Network for Visual Question Answering. 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020
- [25] Efficient Multi-step Reasoning Attention Network for Visual Question Answering. THIRTEENTH INTERNATIONAL CONFERENCE ON GRAPHICS AND IMAGE PROCESSING (ICGIP 2021), 2022, 12083
- [27] Pano-AVQA: Grounded Audio-Visual Question Answering on 360° Videos. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 2011-2021
- [29] ADAPTIVE ATTENTION FUSION NETWORK FOR VISUAL QUESTION ANSWERING. 2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2017: 997-1002