Multi-modal multi-head self-attention for medical VQA

Cited by: 0
Authors
Vasudha Joshi
Pabitra Mitra
Supratik Bose
Affiliations
[1] Computer Science and Engineering
[2] Indian Institute of Technology
[3] Varian Medical Systems Inc.
Source
Multimedia Tools and Applications | 2024, Vol. 83
Keywords
Medical visual question answering; Multi-head self-attention; DistilBERT; VQA-Med 2019
DOI
Not available
Abstract
Medical Visual Question Answering (MedVQA) systems answer natural-language questions about radiology images. Medical images are more complex than general-domain images: they have low contrast and closely resemble one another, so their differences can often be discerned only by medical practitioners, whereas differences between general-domain images are easily spotted by anyone. Methods designed for general-domain Visual Question Answering (VQA) systems therefore cannot be applied directly. The performance of MedVQA systems depends mainly on how the features of the two input modalities, the medical image and the question, are combined. In this work, we propose an architecturally simple fusion strategy that uses multi-head self-attention to combine the images and questions of the VQA-Med dataset from the ImageCLEF 2019 challenge. The model captures long-range dependencies between the input modalities through the attention mechanism of the Transformer. We show experimentally that increasing the embedding dimension used in the Transformer improves the representational power of the model. We achieve an overall accuracy of 60.0%, a 1.35% improvement over the existing model. We also perform an ablation study to elucidate the importance of each model component.
Pages: 42585 - 42608
Page count: 23
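The fusion strategy described in the abstract can be illustrated with a minimal sketch. The module below is a hypothetical reconstruction, not the authors' implementation: it assumes regional CNN image features and DistilBERT token embeddings as inputs, projects both into a shared embedding space, runs joint multi-head self-attention over the concatenated token sequence, and classifies over a fixed answer set. All names, dimensions, and the answer-set size are illustrative assumptions.

import torch
import torch.nn as nn

class MultiModalSelfAttentionFusion(nn.Module):
    # Hypothetical sketch of multi-head self-attention fusion for MedVQA.
    # img_dim/txt_dim/embed_dim/num_answers are illustrative, not from the paper.
    def __init__(self, img_dim=2048, txt_dim=768, embed_dim=512,
                 num_heads=8, num_answers=1000):
        super().__init__()
        # Project both modalities into a shared embedding space.
        self.img_proj = nn.Linear(img_dim, embed_dim)
        self.txt_proj = nn.Linear(txt_dim, embed_dim)
        # Joint self-attention over the concatenated sequence lets every image
        # region attend to every question token, and vice versa.
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(embed_dim)
        # MedVQA is commonly cast as classification over candidate answers.
        self.classifier = nn.Linear(embed_dim, num_answers)

    def forward(self, img_feats, txt_feats):
        # img_feats: (B, R, img_dim) regional CNN features
        # txt_feats: (B, T, txt_dim) DistilBERT token embeddings
        tokens = torch.cat([self.img_proj(img_feats),
                            self.txt_proj(txt_feats)], dim=1)
        fused, _ = self.attn(tokens, tokens, tokens)  # joint self-attention
        fused = self.norm(tokens + fused)             # residual + layer norm
        return self.classifier(fused.mean(dim=1))     # pool, then classify

# Usage with random placeholder features:
model = MultiModalSelfAttentionFusion()
img = torch.randn(2, 49, 2048)   # e.g., a 7x7 ResNet feature map, flattened
txt = torch.randn(2, 20, 768)    # e.g., 20 DistilBERT token embeddings
logits = model(img, txt)         # shape (2, num_answers)

In this sketch, raising embed_dim corresponds to the abstract's observation that longer embeddings improve the model's representational power.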