Retrieved Generative Captioning for Medical Images

Cited by: 0
Authors
Beddiar, Djamila Romaissa [1]
Oussalah, Mourad [2]
Seppanen, Tapio [1]
Affiliations
[1] Univ Oulu, Ctr Machine Vis & Signal Anal, Oulu, Finland
[2] Univ Oulu, Fac Med, Oulu, Finland
Funding
Academy of Finland;
Keywords
Image Captioning; Medical Images; Neural Networks; Retrieval-based Captioning; Generative-based Captioning;
DOI
10.1145/3617233.3617246
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Understanding the content of medical images and mapping it into text is an active research topic at the intersection of two domains: computer vision and natural language processing. This task, known as medical image captioning, plays a vital role in developing automatic diagnosis systems. Recent research in the medical field has produced promising results for both deep-learning-based and retrieval-based image captioning models. However, each approach has its own drawbacks, which can be overcome by combining them. In addition, existing diagnosis systems still cannot explain their findings in the detail that a physician can provide. We therefore present a combination of a generative deep-learning-based method and a retrieval-based model for medical image captioning. First, we train an attention-based encoder-decoder model to generate new captions for given medical images. Then, we feed the generated caption to the retrieval-based model, which retrieves the most similar caption from the training database. This multi-stage approach lets us generate the most important words of the caption (with the generative model) and then search for the closest caption that includes those words (with the retrieval-based model). A second way of combining the two models is to select, in each case, the caption with the highest score among the generated and retrieved captions. We evaluate our proposed model on the medical ROCO dataset. The multi-stage model achieves a BLEU-4 score of 7.89 for the radiology class and 3.19 for the out-of-class data. The best results are achieved by the fused model (the predicted caption is the best among generated and retrieved), with BLEU-4 values of 18.61 for the radiology class and 13.28 for the out-of-class data. Although these scores are low in absolute terms, they outperform the state-of-the-art results on the same dataset and could be further improved.
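
The two combination strategies described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the similarity measure or the scoring function, so the token-level Jaccard overlap, the function names (`jaccard`, `retrieve_closest`, `fuse`), and the toy captions are all assumptions chosen for illustration. The string `generated` stands in for the output of the attention-based encoder-decoder.

```python
from typing import Callable, Sequence


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two captions (an assumed similarity measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def retrieve_closest(generated: str, train_captions: Sequence[str]) -> str:
    """Multi-stage step: return the training caption most similar to the generated one."""
    return max(train_captions, key=lambda c: jaccard(generated, c))


def fuse(generated: str, retrieved: str, score: Callable[[str], float]) -> str:
    """Fused step: keep whichever candidate the (caller-supplied) score rates higher."""
    return max((generated, retrieved), key=score)


# Hypothetical training captions, not taken from ROCO.
train_captions = [
    "chest x-ray showing right lower lobe pneumonia",
    "axial ct scan of the abdomen showing a liver lesion",
    "mri of the brain showing a frontal lobe mass",
]

generated = "ct scan abdomen liver lesion"  # stand-in for the decoder output
retrieved = retrieve_closest(generated, train_captions)

# Assumption: candidates are scored against a reference caption, as one would at evaluation time.
reference = "axial ct of the abdomen with a liver lesion"
best = fuse(generated, retrieved, score=lambda c: jaccard(c, reference))
```

In this toy setting the generated caption supplies the key words and the retrieval step returns the full training caption that contains them. A real system would more likely compare captions with learned embeddings or an n-gram metric such as BLEU than with the set overlap used here.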
Pages: 48-54
Number of pages: 7