Medical visual question answering based on question-type reasoning and semantic space constraint

Cited: 11
Authors
Wang, Meiling [1 ]
He, Xiaohai [1 ]
Liu, Luping [1 ]
Qing, Linbo [1 ]
Chen, Honggang [1 ]
Liu, Yan [2 ]
Ren, Chao [1 ]
Affiliations
[1] Sichuan Univ, Coll Elect & Informat Engn, Chengdu 610065, Sichuan, Peoples R China
[2] Southwest Jiaotong Univ, Dept Neurol, Affiliated Hosp, Peoples Hosp 3, Chengdu, Sichuan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Medical visual question answering; Question-type reasoning; Semantic space constraint; Attention mechanism; DYNAMIC MEMORY NETWORKS; LANGUAGE;
DOI
10.1016/j.artmed.2022.102346
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Medical visual question answering (Med-VQA) aims to accurately answer clinical questions about medical images. Despite its enormous potential for application in the medical domain, the current technology is still in its infancy. Compared with the general visual question answering task, the Med-VQA task involves more demanding challenges. First, clinical questions about medical images are usually diverse owing to differences among clinicians and the complexity of diseases; consequently, noise is inevitably introduced when extracting question features. Second, the Med-VQA task has typically been regarded as a classification problem over predefined answers, ignoring the relationships between candidate responses, so the Med-VQA model pays equal attention to all candidate answers when predicting. In this paper, a novel Med-VQA framework is proposed to alleviate these problems. Specifically, we applied a question-type reasoning module separately to closed-ended and open-ended questions, extracting the important information contained in the questions through an attention mechanism and filtering out noise to obtain more valuable question features. To take advantage of the relational information between answers, we designed a semantic constraint space that computes the similarity between answers and assigns higher attention to answers with high correlation. To evaluate the effectiveness of the proposed method, extensive experiments were conducted on a public dataset, VQA-RAD. Experimental results showed that the proposed method achieved better performance than other state-of-the-art methods. The overall accuracy, closed-ended accuracy, and open-ended accuracy reached 74.1%, 82.7%, and 60.9%, respectively. Notably, the absolute accuracy of the proposed method improved by 5.5% for closed-ended questions.
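The abstract describes two mechanisms: type-specific attention over question words, and a semantic constraint space that relates candidate answers so correlated answers receive more attention. The PyTorch snippet below is a minimal illustrative sketch of both ideas, assuming standard attention pooling over word features and a cosine-similarity relation matrix over learnable answer embeddings; the module names, dimensions, answer-set size, and the logit-blending scheme are assumptions for illustration, not the paper's actual formulation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class QuestionTypeAttention(nn.Module):
    # Hypothetical question-type reasoning: separate attention scorers for
    # closed-ended and open-ended questions emphasise type-specific keywords
    # and suppress noisy words when pooling word features.
    def __init__(self, dim):
        super().__init__()
        self.score_closed = nn.Linear(dim, 1)
        self.score_open = nn.Linear(dim, 1)

    def forward(self, word_feats, is_closed):
        # word_feats: (batch, num_words, dim); is_closed: (batch,) bool tensor
        scores = torch.where(is_closed[:, None, None],
                             self.score_closed(word_feats),
                             self.score_open(word_feats))
        attn = F.softmax(scores, dim=1)              # attention over words
        return (attn * word_feats).sum(dim=1)        # pooled question feature

class SemanticSpaceConstraint(nn.Module):
    # Hypothetical semantic space constraint: candidate answers get learnable
    # embeddings, a cosine-similarity matrix relates them, and the classifier
    # logits are blended with relation-propagated logits so that highly
    # correlated answers receive more attention than unrelated ones.
    def __init__(self, num_answers, dim):
        super().__init__()
        self.answer_emb = nn.Embedding(num_answers, dim)

    def forward(self, logits, alpha=0.5):
        emb = F.normalize(self.answer_emb.weight, dim=-1)   # (num_answers, dim)
        relation = F.softmax(emb @ emb.t(), dim=-1)         # row-normalised similarity
        return (1.0 - alpha) * logits + alpha * logits @ relation

# Toy usage with random features; the answer-set size is arbitrary here.
q_att = QuestionTypeAttention(dim=64)
constraint = SemanticSpaceConstraint(num_answers=100, dim=64)
words = torch.randn(2, 12, 64)
closed_mask = torch.tensor([True, False])
q_feat = q_att(words, closed_mask)            # (2, 64) question features
raw_logits = torch.randn(2, 100)              # e.g. from a fusion classifier
final_logits = constraint(raw_logits)         # (2, 100) constrained logits

Blending the raw logits with relation-propagated logits is one simple way to bias the classifier toward semantically related answers; the paper's actual constraint may be formulated differently.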
Pages: 11