OVQA: A Clinically Generated Visual Question Answering Dataset

Cited by: 4
Authors
Huang, Yefan [1]
Wang, Xiaoli [1]
Liu, Feiyan [1]
Huang, Guofeng [2]
Affiliations
[1] Xiamen Univ, Sch Informat, Xiamen, Peoples R China
[2] Xiamen Univ, Affiliated Southeast Hosp, Xiamen, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Medical visual question answering; Benchmarking dataset; Semiautomatic data generation;
DOI
10.1145/3477495.3531724
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Medical visual question answering (Med-VQA) is a challenging problem: given a medical image and a clinical question about that image, a system must output a correct answer in natural language. Current medical systems often require large-scale, high-quality labeled data for training and evaluation. To address this need, we present a new dataset, denoted OVQA, which is generated from electronic medical records. We develop a semi-automatic data generation tool for constructing the dataset. First, medical entities are automatically extracted from medical records and filled into predefined templates to generate question-answer pairs. These pairs are then combined with medical images extracted from the corresponding medical records to generate candidates for visual question answering (VQA). Finally, experienced physicians verify the candidates and annotate them with high-quality labels. To evaluate the quality of OVQA, we conduct comprehensive experiments applying state-of-the-art Med-VQA methods to our dataset. The results show that OVQA can be used as a benchmarking dataset for evaluating existing Med-VQA systems. The dataset can be downloaded from http://47.94.174.82/.
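The template-filling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' tool: the templates, the entity keys (`organ`, `modality`, `abnormality`), the `Candidate` class, and the image identifier are all hypothetical, since the abstract does not specify them. The sketch only shows the general idea of filling extracted entities into predefined templates and pairing the result with an image, leaving verification to a physician.

```python
from dataclasses import dataclass

# Hypothetical (question template, answer template) pairs.
# The actual OVQA templates are not given in the abstract.
TEMPLATES = {
    "organ": ("What organ is shown in this image?", "{organ}"),
    "modality": ("What imaging modality was used?", "{modality}"),
    "abnormality": ("Is there a {abnormality} in the {organ}?", "yes"),
}


@dataclass
class Candidate:
    image_id: str
    question: str
    answer: str
    verified: bool = False  # flipped to True only after physician review


def generate_candidates(image_id, entities):
    """Fill every template whose slots are all covered by the extracted entities."""
    candidates = []
    for question_tpl, answer_tpl in TEMPLATES.values():
        try:
            question = question_tpl.format(**entities)
            answer = answer_tpl.format(**entities)
        except KeyError:
            # A required entity was not extracted from this record; skip the template.
            continue
        candidates.append(Candidate(image_id, question, answer))
    return candidates


# Entities assumed to have been extracted from one medical record (illustrative values).
entities = {"organ": "knee", "modality": "X-ray"}
candidates = generate_candidates("img_0001", entities)
for c in candidates:
    print(c.image_id, "|", c.question, "|", c.answer, "| verified:", c.verified)
```

Templates with unfilled slots are skipped rather than emitted with blanks, so every candidate handed to a reviewer is a complete question-answer pair.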
Pages: 2924-2938
Number of pages: 15