Retrieved Generative Captioning for Medical Images

Cited by: 0

Authors:

Beddiar, Djamila Romaissa [1]

Oussalah, Mourad [2]

Seppanen, Tapio [1]

Affiliations:

[1] Univ Oulu, Ctr Machine Vis & Signal Anal, Oulu, Finland

[2] Univ Oulu, Fac Med, Oulu, Finland

Source:

20TH INTERNATIONAL CONFERENCE ON CONTENT-BASED MULTIMEDIA INDEXING, CBMI 2023 | 2023

Funding:

Academy of Finland

Keywords:

Image Captioning; Medical Images; Neural Networks; Retrieval-based Captioning; Generative-based Captioning

DOI:

10.1145/3617233.3617246

Chinese Library Classification (CLC):

TP18 [Artificial Intelligence Theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Understanding the content of medical images and mapping it to text is an active research topic at the intersection of two domains: computer vision and natural language processing. This task is known as medical image captioning, and it plays a vital role in developing automatic diagnosis systems. Recent research in the medical field has produced promising results for both deep-learning-based and retrieval-based image captioning models. However, each approach has its own drawbacks, which can be overcome by combining the two. In addition, existing diagnosis systems still cannot explain their findings in the detail that a physician can provide. We therefore present a combination of a generative deep-learning-based method and a retrieval-based model for medical image captioning. First, we train an attention-based encoder-decoder model to generate new captions for given medical images. Then, we feed the caption produced by the generative model to the retrieval-based model, which retrieves the most similar caption from the training database. This multi-stage approach lets us generate the most important words of the caption (with the generative model) and then search for the closest caption that includes those words (with the retrieval-based model). An alternative way of combining the two models is to select, in each case, the caption with the highest score among the generated and retrieved captions. We evaluate the proposed model on the medical ROCO dataset. The multi-stage model achieves a BLEU-4 score of 7.89 for the radiology class and 3.19 for the out-of-class data. The best results are achieved by the fused model (the predicted caption is the best among the generated and retrieved ones), with BLEU-4 scores of 18.61 for the radiology class and 13.28 for the out-of-class data. Although these scores appear low, they outperform the state-of-the-art results on the same dataset and could be further improved.
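
The two combination strategies described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state which similarity measure or caption score is used, so the Jaccard word overlap below is an assumption, and the names `jaccard`, `retrieve`, `multi_stage`, and `pick_best` are hypothetical. The trained attention-based encoder-decoder is represented only by a `generate` callable supplied by the caller.

```python
from typing import Any, Callable, List


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard overlap (an assumed stand-in similarity measure)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def retrieve(query: str, training_captions: List[str]) -> str:
    """Return the training caption most similar to the query caption."""
    return max(training_captions, key=lambda caption: jaccard(query, caption))


def multi_stage(image: Any,
                generate: Callable[[Any], str],
                training_captions: List[str]) -> str:
    """Multi-stage variant: generate a caption, then use it as the query
    for retrieval over the training captions."""
    generated = generate(image)
    return retrieve(generated, training_captions)


def pick_best(generated: str,
              retrieved: str,
              score: Callable[[str], float]) -> str:
    """Fused variant: keep whichever candidate the scoring function rates
    higher. The scoring function is an assumption left to the caller."""
    return max([generated, retrieved], key=score)
```

In this sketch, `generate` would wrap the trained captioning model and `training_captions` would hold the captions of the training split. Swapping `jaccard` for another text similarity only requires changing the `key` function in `retrieve`.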

Pages: 48-54

Page count: 7
