Context-VQA: Towards Context-Aware and Purposeful Visual Question Answering

Cited by: 0
Authors
Naik, Nandita [1]
Potts, Christopher [1]
Kreiss, Elisa [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
DOI
10.1109/ICCVW60793.2023.00301
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Visual question answering (VQA) has the potential to make the Internet more accessible in an interactive way, allowing people who cannot see images to ask questions about them. However, multiple studies have shown that people who are blind or have low vision prefer image explanations that incorporate the context in which an image appears, yet current VQA datasets focus on images in isolation. We argue that VQA models will not fully succeed at meeting people's needs unless they take context into account. To further motivate and analyze the distinction between different contexts, we introduce Context-VQA, a VQA dataset that pairs images with contexts, specifically types of websites (e.g., a shopping website). We find that the types of questions vary systematically across contexts. For example, images presented in a travel context garner 2 times more "Where?" questions, and images on social media and news garner 2.8 and 1.8 times more "Who?" questions, respectively, than the average. We also find that context effects are especially important when participants can't see the image. These results demonstrate that context affects the types of questions asked and that VQA models should be context-sensitive to better meet people's needs, especially in accessibility settings.
Pages: 2813-2817
Number of pages: 5