- [42] Hierarchical Multi-Task Learning for Diagram Question Answering with Multi-Modal Transformer [J]. PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021: 1313-1321
- [43] TASK-ORIENTED MULTI-MODAL QUESTION ANSWERING FOR COLLABORATIVE APPLICATIONS [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020: 1426-1430
- [44] MMTF: Multi-Modal Temporal Fusion for Commonsense Video Question Answering [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW, 2023: 4659-4664
- [45] Multi-Modal Knowledge-Aware Attention Network for Question Answering [J]. Xu, Changsheng (csxu@nlpr.ia.ac.cn), 1600, Science Press (57): 1037-1045
- [46] Multi-modal Question Answering System Driven by Domain Knowledge Graph [J]. 5TH INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING AND COMMUNICATIONS (BIGCOM 2019), 2019: 43-47
- [49] RAMM: Retrieval-augmented Biomedical Visual Question Answering with Multi-modal Pre-training [J]. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 547-556
- [50] Pre-Training Multi-Modal Dense Retrievers for Outside-Knowledge Visual Question Answering [J]. PROCEEDINGS OF THE 2023 ACM SIGIR INTERNATIONAL CONFERENCE ON THE THEORY OF INFORMATION RETRIEVAL, ICTIR 2023, 2023: 169-176