The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning

Cited by: 13
Authors
Hessel, Jack [1]
Hwang, Jena D. [1]
Park, Jae Sung [2]
Zellers, Rowan [2]
Bhagavatula, Chandra [1]
Rohrbach, Anna [3]
Saenko, Kate [4, 5]
Choi, Yejin [1, 2]
Affiliations
[1] Allen Inst AI, Seattle, WA 98103 USA
[2] Univ Washington, Paul G Allen Sch Comp Sci & Engn, Seattle, WA USA
[3] Univ Calif Berkeley, Berkeley, CA USA
[4] Boston Univ, Boston, MA USA
[5] MIT IBM Watson AI, Cambridge, MA USA
DOI
10.1007/978-3-031-20059-5_32
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Humans have remarkable capacity to reason abductively and hypothesize about what lies beyond the literal content of an image. By identifying concrete visual clues scattered throughout a scene, we almost can't help but draw probable inferences beyond the literal scene based on our everyday experience and knowledge about the world. For example, if we see a "20 mph" sign alongside a road, we might assume the street sits in a residential area (rather than on a highway), even if no houses are pictured. Can machines perform similar visual reasoning? We present Sherlock, an annotated corpus of 103K images for testing machine capacity for abductive reasoning beyond literal image contents. We adopt a free-viewing paradigm: participants first observe and identify salient clues within images (e.g., objects, actions) and then provide a plausible inference about the scene, given the clue. In total, we collect 363K (clue, inference) pairs, which form a first-of-its-kind abductive visual reasoning dataset. Using our corpus, we test three complementary axes of abductive reasoning. We evaluate the capacity of models to: i) retrieve relevant inferences from a large candidate corpus; ii) localize evidence for inferences via bounding boxes, and iii) compare plausible inferences to match human judgments on a newly-collected diagnostic corpus of 19K Likert-scale judgments. While we find that fine-tuning CLIP-RN50x64 with a multitask objective outperforms strong baselines, significant headroom exists between model performance and human agreement. Data, models, and leaderboard available at http://visualabduction.com/.
Pages: 558-575
Page count: 18
Related papers
50 in total
  • [1] Sherlock Holmes and the Rules of Scientific Reasoning
    Schurr, Juergen
    IEEE INSTRUMENTATION & MEASUREMENT MAGAZINE, 2016, 19 (04) : 6 - 9
  • [2] VideoABC: A Real-World Video Dataset for Abductive Visual Reasoning
    Zhao, Wenliang
    Rao, Yongming
    Tang, Yansong
    Zhou, Jie
    Lu, Jiwen
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 6048 - 6061
  • [3] ARE THE CONCLUSIONS OF SHERLOCK-HOLMES OR DUPIN OF ABDUCTIVE NATURE - THE INCORRECT DEFINITION OF ABDUCTION IN SEMIOTIC ANALYSIS APPLIED TO DETECTIVE-STORIES
    REICHERTZ, J
    KODIKAS CODE-ARS SEMEIOTICA, 1990, 13 (3-4) : 307 - 324
  • [4] From Visual Abduction to Abductive Vision
    Park, Woosuk
    PHILOSOPHY AND COGNITIVE SCIENCE II: WESTERN & EASTERN STUDIES, 2015, 20 : 141 - 153
  • [5] Visual Abductive Reasoning
    Liang, Chen
    Wang, Wenguan
    Zhou, Tianfei
    Yang, Yi
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022 : 15544 - 15554
  • [6] "Brainy Is the New Sexy": Sherlock Holmes, Abduction, and Neural Networks
    Petrovsky, Helen, V
    FILOSOFSKII ZHURNAL, 2019, 12 (01) : 74 - 89
  • [7] Visual abductive reasoning in archaeology
    Shelley, C
    PHILOSOPHY OF SCIENCE, 1996, 63 (02) : 278 - 301
  • [8] Clinical reasoning: Sherlock Holmes or Dr Joseph Bell
    Berezutsky, Volodymyr
    MEDICAL TEACHER, 2023, 45 (01) : 114 - 114
  • [9] Sherlock Holmes and the rules of scientific reasoning [Basic Metrology]
    Schurr J.
    IEEE Instrumentation and Measurement Magazine, 2016, 19 (04) : 7 - 9 and 14
  • [10] CONSIDERATION ON REICHERTZ,JO ESSAY QUESTIONING SHERLOCK-HOLMES ABDUCTIVE LOGIC
    ROHR, S
    KODIKAS CODE-ARS SEMEIOTICA, 1990, 13 (3-4) : 325 - 328