CLEVR-X: A Visual Reasoning Dataset for Natural Language Explanations

Cited by: 8
Authors
Salewski, Leonard [1]
Koepke, A. Sophia [1]
Lensch, Hendrik P. A. [1]
Akata, Zeynep [1,2,3]
Affiliations
[1] Univ Tubingen, Tubingen, Germany
[2] MPI Informat, Saarbrucken, Germany
[3] MPI Intelligent Syst, Tubingen, Germany
Keywords
Visual question answering; Natural language explanations;
DOI
10.1007/978-3-031-04083-2_5
CLC Number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Providing explanations in the context of Visual Question Answering (VQA) presents a fundamental problem in machine learning. To obtain detailed insights into the process of generating natural language explanations for VQA, we introduce the large-scale CLEVR-X dataset that extends the CLEVR dataset with natural language explanations. For each image-question pair in the CLEVR dataset, CLEVR-X contains multiple structured textual explanations which are derived from the original scene graphs. By construction, the CLEVR-X explanations are correct and describe the reasoning and visual information that is necessary to answer a given question. We conducted a user study to confirm that the ground-truth explanations in our proposed dataset are indeed complete and relevant. We present baseline results for generating natural language explanations in the context of VQA using two state-of-the-art frameworks on the CLEVR-X dataset. Furthermore, we provide a detailed analysis of the explanation generation quality for different question and answer types. Additionally, we study the influence of using different numbers of ground-truth explanations on the convergence of natural language generation (NLG) metrics. The CLEVR-X dataset is publicly available at https://github.com/ExplainableML/CLEVR-X.
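As a rough illustration of the multi-reference evaluation mentioned in the abstract, the sketch below scores a generated explanation against a varying number of ground-truth explanations with a sentence-level BLEU metric from NLTK. The explanation strings, the choice of BLEU, and the loop over reference counts are illustrative assumptions for this record, not taken from the CLEVR-X evaluation code.

# Minimal sketch (illustrative only): scoring a generated explanation against
# a growing set of ground-truth explanations, mirroring the convergence study
# of NLG metrics described in the abstract. The explanations below are
# invented for illustration and do not come from the CLEVR-X dataset.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Hypothetical ground-truth explanations for one image-question pair.
ground_truth = [
    "there is a large red rubber cube behind the small sphere",
    "behind the small sphere there is a large red cube made of rubber",
    "the large red rubber cube is located behind the small sphere",
]
# Hypothetical explanation produced by a generation model.
generated = "there is a large red cube behind the small sphere"

smooth = SmoothingFunction().method1
hypothesis = generated.split()

# Score against the first k references to see how the metric value changes
# as more ground-truth explanations are made available.
for k in range(1, len(ground_truth) + 1):
    references = [ref.split() for ref in ground_truth[:k]]
    score = sentence_bleu(references, hypothesis, smoothing_function=smooth)
    print(f"BLEU with {k} reference(s): {score:.3f}")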
Pages: 69-88 (20 pages)