A dataset to answer visual questions about named entities

Cited by: 0
Authors
Lerner, Paul [1]
Messoud, Salem [1]
Ferret, Olivier [2]
Guinaudeau, Camille [1]
Le Borgne, Herve [2]
Besancon, Romaric [2]
Moreno, Jose G. [3]
Melgarejo, Jesus Lovon [3]
Affiliations
[1] Univ Paris Saclay, CNRS, LISN, F-91400 Orsay, France
[2] Univ Paris Saclay, CEA, List, F-91120 Palaiseau, France
[3] Univ Paul Sabatier, IRIT, UMR 5505, CNRS, Toulouse, France
Source
TRAITEMENT AUTOMATIQUE DES LANGUES | 2022 / Vol. 63 / No. 2
Keywords
Dataset; Knowledge-based Visual Question Answering; Multimodality
DOI
Not available
Chinese Library Classification number
H0 [Linguistics]
Discipline classification codes
030303; 0501; 050102
Abstract
In the context of multimodal processing, we focus our work on Knowledge-based Visual Question Answering about named Entities (KVQAE). We provide ViQuAE, a novel dataset of 3,700 questions paired with images, annotated using a semi-automatic method. It is the first KVQAE dataset to cover a wide range of entity types, associated with a knowledge base composed of 1.5M Wikipedia articles paired with images. To set a baseline on the benchmark, we address KVQAE as a three-stage problem: initial Information Retrieval, Re-Ranking, and Reading Comprehension. The experiments empirically demonstrate the difficulty of the task and pave the way towards better multimodal entity representations.
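The three-stage framing in the abstract (retrieval, re-ranking, reading comprehension) can be illustrated with a minimal, text-only sketch. Everything below is a hypothetical stand-in: the function names, the word-overlap scoring, and the toy "reader" are assumptions made for illustration and are not the authors' models, and the image paired with each question is ignored here.

```python
# Toy, text-only illustration of a retrieve -> re-rank -> read pipeline.
# All names and scoring rules are hypothetical; the image input is ignored.

def tokens(text):
    """Lowercase word tokens with surrounding punctuation stripped."""
    return [w.strip(".,?!").lower() for w in text.split() if w.strip(".,?!")]


def retrieve(question, passages, k=2):
    """Stage 1 (toy retrieval): keep the k passages sharing the most words with the question."""
    q = set(tokens(question))
    ranked = sorted(passages, key=lambda p: len(q & set(tokens(p))), reverse=True)
    return ranked[:k]


def rerank(question, candidates):
    """Stage 2 (toy re-ranking): reorder candidates by overlap normalised by passage length."""
    q = set(tokens(question))

    def score(passage):
        t = tokens(passage)
        return len(q & set(t)) / len(t) if t else 0.0

    return sorted(candidates, key=score, reverse=True)


def read(question, passage):
    """Stage 3 (toy reader): return the passage words that do not occur in the question."""
    q = set(tokens(question))
    return [w for w in tokens(passage) if w not in q]


def answer(question, passages, k=2):
    """Chain the three stages and read from the top re-ranked passage."""
    top = rerank(question, retrieve(question, passages, k))
    return read(question, top[0]) if top else []


if __name__ == "__main__":
    kb = [
        "The tower is located in Paris.",
        "The bridge is located in London and is very old.",
        "Cats sleep a lot.",
    ]
    print(answer("Where is the tower located?", kb))
```

In a real system each stage would be replaced by a learned component, and the retrieval stage would also have to use the image to identify the entity; the sketch only shows how the stages hand results to one another.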
Pages: 15-39
Page count: 25