Multimodal Graph Transformer for Multimodal Question Answering

Cited by: 0
Authors
He, Xuehai [1 ]
Wang, Xin Eric [1 ]
Affiliation
[1] UC Santa Cruz, United States
Source
arXiv | 2023
Keywords
Compendex;
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
Semantics
Related papers
50 in total
  • [1] Multimodal Graph Transformer for Multimodal Question Answering
    He, Xuehai
    Wang, Xin Eric
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 189 - 200
  • [2] Multimodal Graph Transformer for Multimodal Question Answering
    He, Xuehai
    Wang, Xin Eric
    EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference, 2023, : 189 - 200
  • [3] Multimodal Graph Reasoning and Fusion for Video Question Answering
    Zhang, Shuai
    Wang, Xingfu
    Hawbani, Ammar
    Zhao, Liang
    Alsamhi, Saeed Hamood
    2022 IEEE INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS, TRUSTCOM, 2022, : 1410 - 1415
  • [4] Multimodal Graph Networks for Compositional Generalization in Visual Question Answering
    Saqur, Raeid
    Narasimhan, Karthik
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [5] MIMOQA: Multimodal Input Multimodal Output Question Answering
    Singh, Hrituraj
    Nasery, Anshul
    Mehta, Denil
    Agarwal, Aishwarya
    Lamba, Jatin
    Srinivasan, Balaji Vasan
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 5317 - 5332
  • [6] Efficient End-to-End Video Question Answering with Pyramidal Multimodal Transformer
    Peng, Min
    Wang, Chongyang
    Shi, Yu
    Zhou, Xiang-Dong
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 2, 2023, : 2038 - 2046
  • [7] Multimodal Attention for Visual Question Answering
    Kodra, Lorena
    Mece, Elinda Kajo
    INTELLIGENT COMPUTING, VOL 1, 2019, 858 : 783 - 792
  • [8] MMFT-BERT: Multimodal Fusion Transformer with BERT Encodings for Visual Question Answering
    Khan, Aisha Urooj
    Mazaheri, Amir
    Lobo, Niels Da Vitoria
    Shah, Mubarak
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 4648 - 4660
  • [9] Multimodal deep fusion for image question answering
    Zhang, Weifeng
    Yu, Jing
    Wang, Yuxia
    Wang, Wei
    KNOWLEDGE-BASED SYSTEMS, 2021, 212
  • [10] Multimodal Learning and Reasoning for Visual Question Answering
    Ilievski, Ilija
    Feng, Jiashi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30