Answer-Type Prediction for Visual Question Answering

Cited by: 65

Authors:

Kafle, Kushal [1]

Kanan, Christopher [1]

Affiliations:

[1] Rochester Inst Technol, Chester F Carlson Ctr Imaging Sci, Rochester, NY 14623 USA

Source:

2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2016

DOI:

10.1109/CVPR.2016.538

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Recently, algorithms for object recognition and related tasks have become sufficiently proficient that new vision tasks can now be pursued. In this paper, we build a system capable of answering open-ended text-based questions about images, which is known as Visual Question Answering (VQA). Our approach's key insight is that we can predict the form of the answer from the question. We formulate our solution in a Bayesian framework. When our approach is combined with a discriminative model, the combined model achieves state-of-the-art results on four benchmark datasets for open-ended VQA: DAQUAR, COCO-QA, The VQA Dataset, and Visual7W.

Pages: 4976-4984

Page count: 9
