Answer-Type Prediction for Visual Question Answering

Cited by: 65
Authors
Kafle, Kushal [1]
Kanan, Christopher [1]
Affiliations
[1] Rochester Inst Technol, Chester F Carlson Ctr Imaging Sci, Rochester, NY 14623 USA
DOI
10.1109/CVPR.2016.538
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently, algorithms for object recognition and related tasks have become sufficiently proficient that new vision tasks can now be pursued. In this paper, we build a system capable of answering open-ended text-based questions about images, which is known as Visual Question Answering (VQA). Our approach's key insight is that we can predict the form of the answer from the question. We formulate our solution in a Bayesian framework. When our approach is combined with a discriminative model, the combined model achieves state-of-the-art results on four benchmark datasets for open-ended VQA: DAQUAR, COCO-QA, The VQA Dataset, and Visual7W.
Pages: 4976-4984
Page count: 9