Medical Visual Question Answering via Conditional Reasoning and Contrastive Learning

Cited by: 4
Authors
Liu, Bo [1]
Zhan, Li-Ming [1]
Xu, Li [1]
Wu, Xiao-Ming [1]
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Hung Hom, Hong Kong, Peoples R China
Keywords
Medical visual question answering; conditional reasoning; contrastive learning
DOI
10.1109/TMI.2022.3232411
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Medical visual question answering (Med-VQA) aims to accurately answer a clinical question presented with a medical image. Despite its enormous potential in healthcare services, the development of this technology is still at an early stage. On the one hand, Med-VQA tasks are highly challenging because clinical questions are massively diverse, and different types of questions require different visual reasoning skills. On the other hand, medical images are complex in nature and very different from natural images, while current Med-VQA datasets are small-scale, with only a few hundred radiology images, making it difficult to train a well-performing visual feature extractor. This paper addresses these two critical issues. We propose a novel conditional reasoning mechanism with a question-conditioned reasoning component and a type-conditioned reasoning strategy to adaptively learn effective reasoning skills for different Med-VQA tasks. Further, we propose to pre-train a visual feature extractor for Med-VQA via contrastive learning on large amounts of unlabeled radiology images. The effectiveness of our proposals is validated by extensive experiments on existing Med-VQA benchmarks, which show that our model significantly improves prediction accuracy over state-of-the-art methods. The source code and pre-training dataset are provided at https://github.com/Awenbocc/CPCR.
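The abstract does not say which contrastive objective is used for pre-training. As a rough illustration of the general idea only, the sketch below computes an InfoNCE-style loss over two sets of embeddings, where row i of each set is assumed to come from two augmented views of the same unlabeled image. The function names, the temperature value, and the toy vectors are all hypothetical and are not taken from the paper or its repository.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss (illustrative; not the paper's exact objective).

    anchors[i] should be similar to positives[i] and dissimilar to
    positives[j] for every j != i.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Cross-entropy with the matching pair as the correct class.
        total += log_denominator - logits[i]
    return total / len(anchors)


# Toy embeddings standing in for two augmented views of two images.
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[0.9, 0.1], [0.1, 0.9]]

aligned = info_nce(view_a, view_b)         # matching pairs line up
shuffled = info_nce(view_a, view_b[::-1])  # pairs deliberately mismatched
```

With correctly paired views the loss is small, and it grows when the pairs are mismatched, which is the signal that pushes an encoder to produce similar features for views of the same image without any labels.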
Pages: 1532-1545
Page count: 14