OVQA: A Clinically Generated Visual Question Answering Dataset

被引:4
|
作者
Huang, Yefan [1 ]
Wang, Xiaoli [1 ]
Liu, Feiyan [1 ]
Huang, Guofeng [2 ]
机构
[1] Xiamen Univ, Sch Informat, Xiamen, Peoples R China
[2] Xiamen Univ, Affiliated Southeast Hosp, Xiamen, Peoples R China
基金
中国国家自然科学基金;
关键词
Medical visual question answering; Benchmarking dataset; Semiautomatic data generation;
D O I
10.1145/3477495.3531724
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Medical visual question answering (Med-VQA) is a challenging problem that aims to take a medical image and a clinical question about the image as input and output a correct answer in natural language. Current medical systems often require large-scale and high-quality labeled data for training and evaluation. To address the challenge, we present a new dataset, denoted by OVQA, which is generated from electronic medical records. We develop a semi-automatic data generation tool for constructing the dataset. First, medical entities are automatically extracted from medical records and filled into predefined templates for generating question and answer pairs. These pairs are then combined with medical images extracted from corresponding medical records, to generate candidates for visual question answering (VQA). The candidates are finally verified with high-quality labels annotated by experienced physicians. To evaluate the quality of OVQA, we conduct comprehensive experiments on state-of-the-art methods for the Med-VQA task to our dataset. The results show that our OVQA can be used as a benchmarking dataset for evaluating existing Med-VQA systems. The dataset can be downloaded from http://47.94.174.82/.
引用
收藏
页码:2924 / 2938
页数:15
相关论文
共 50 条
  • [31] PubMedQA: A Dataset for Biomedical Research Question Answering
    Jin, Qiao
    Dhingra, Bhuwan
    Liu, Zhengping
    Cohen, William W.
    Lu, Xinghua
    [J]. 2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 2567 - 2577
  • [32] Visual Question Generation as Dual Task of Visual Question Answering
    Li, Yikang
    Duan, Nan
    Zhou, Bolei
    Chu, Xiao
    Ouyang, Wanli
    Wang, Xiaogang
    Zhou, Ming
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 6116 - 6124
  • [33] ArabicaQA: A Comprehensive Dataset for Arabic Question Answering
    Abdallah, Abdelrahman
    Kasem, Mahmoud
    Abdalla, Mahmoud
    Mahmoud, Mohamed
    Elkasaby, Mohamed
    Elbendary, Yasser
    Jatowt, Adam
    [J]. PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024, 2024, : 2049 - 2059
  • [34] VQuAD: Video Question Answering Diagnostic Dataset
    Gupta, Vivek
    Patro, Badri N.
    Parihar, Hemant
    Namboodiri, Vinay P.
    [J]. 2022 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WORKSHOPS (WACVW 2022), 2022, : 282 - 291
  • [35] PerCQA: Persian Community Question Answering Dataset
    Jamali, Naghme
    Yaghoobzadeh, Yadollah
    Faili, Heshaam
    [J]. LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 6083 - 6092
  • [36] MemoriQA: A Question-Answering Lifelog Dataset
    Tran, Quang-Linh
    Nguyen, Binh
    Jones, Gareth J. F.
    Gurrin, Cathal
    [J]. PROCEEDINGS OF THE FIRST ACM WORKSHOP ON AI-POWERED QUESTION ANSWERING SYSTEMS FOR MULTIMEDIA, AIQAM 2024, 2024, : 7 - 12
  • [37] Towards a Polish Question Answering Dataset (PoQuAD)
    Tuora, Ryszard
    Zawadzka-Paluektau, Natalia
    Klamra, Cezary
    Zwierzchowska, Aleksandra
    Kobylinski, Lukasz
    [J]. FROM BORN-PHYSICAL TO BORN-VIRTUAL: AUGMENTING INTELLIGENCE IN DIGITAL LIBRARIES, ICADL 2022, 2022, 13636 : 194 - 203
  • [38] TutorialVQA: Question Answering Dataset for Tutorial Videos
    Colas, Anthony
    Kim, Seokhwan
    Dernoncourt, Franck
    [J]. PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 5450 - 5455
  • [39] QED: A Framework and Dataset for Explanations in Question Answering
    Lamm, Matthew
    Palomaki, Jennimaria
    Alberti, Chris
    Andor, Daniel
    Choi, Eunsol
    Soares, Livio Baldini
    Collins, Michael
    [J]. TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2021, 9 : 790 - 806
  • [40] A Portuguese Dataset for Evaluation of Semantic Question Answering
    de Araujo, Denis Andrei
    Rigo, Sandro Jose
    Quaresma, Paulo
    Muniz, Joao Henrique
    [J]. COMPUTATIONAL PROCESSING OF THE PORTUGUESE LANGUAGE, PROPOR 2020, 2020, 12037 : 217 - 227