Flexible Visual Grounding

Authors
Kim, Yongmin [1]
Chu, Chenhui [1]
Kurohashi, Sadao [1]
Affiliations
[1] Kyoto Univ, Kyoto, Japan
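Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract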
Existing visual grounding datasets are artificially constructed so that every query about an entity can be grounded to a corresponding image region, i.e., is answerable. However, in real-world multimedia data such as news articles and social media, many entities in the text cannot be grounded to the image, i.e., are unanswerable, because the text does not necessarily describe the accompanying image directly. A robust visual grounding model should be able to deal flexibly with both answerable and unanswerable queries. To study this flexible visual grounding problem, we construct a pseudo dataset and a social media dataset that include both answerable and unanswerable queries. To handle unanswerable visual grounding, we propose a novel method that adds a pseudo image region corresponding to a query that cannot be grounded. The model is then trained to ground answerable queries to their ground-truth regions and unanswerable queries to the pseudo region. In our experiments, we show that our model can flexibly process both answerable and unanswerable queries with high accuracy on our datasets.
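The pseudo-region idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the model architecture or the loss, so everything here is an assumption made for illustration: each candidate region is assumed to already have a scalar matching score for the query, the pseudo region is assumed to be one extra score appended to that list, and training is assumed to use a softmax cross-entropy loss. The names `grounding_loss` and `predict` are hypothetical and do not come from the paper.

```python
import math
from typing import List, Optional


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def grounding_loss(region_scores: List[float],
                   pseudo_score: float,
                   gt_index: Optional[int]) -> float:
    """Cross-entropy over the real regions plus one pseudo region.

    gt_index is the ground-truth region for an answerable query,
    or None for an unanswerable query (assumed labeling convention).
    """
    scores = list(region_scores) + [pseudo_score]
    # Unanswerable queries are trained to select the pseudo region,
    # which sits at the last position.
    target = len(region_scores) if gt_index is None else gt_index
    return -math.log(softmax(scores)[target])


def predict(region_scores: List[float], pseudo_score: float) -> Optional[int]:
    """Return the best region index, or None if the pseudo region wins."""
    scores = list(region_scores) + [pseudo_score]
    best = max(range(len(scores)), key=scores.__getitem__)
    return None if best == len(region_scores) else best
```

Under these assumptions, a query is treated as unanswerable exactly when the pseudo region outscores every real region, so a single classifier covers both cases without a separate answerability threshold.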
Pages: 285-299
Page count: 15