Medical image captioning via generative pretrained transformers

Cited by: 18

Authors:

Selivanov, Alexander [1,2]

Rogov, Oleg Y. [1]

Chesakov, Daniil [1,3]

Shelmanov, Artem [1,3]

Fedulova, Irina [2]

Dylov, Dmitry V. [1]

Affiliations:

[1] Skolkovo Inst Sci & Technol, Bolshoy blvd,30-1, Moscow 121205, Russia

[2] Philips Russia, Skolkovo Technopark 42,Bldg 1,Bolshoi Blvd, Moscow 121205, Russia

[3] AIRI, Kutuzovsky Ave,32 bld 1, Moscow 121170, Russia

Source:

SCIENTIFIC REPORTS | 2023 / Vol. 13 / Issue 01

Keywords:

NETWORKS;

DOI:

10.1038/s41598-023-31223-5

Chinese Library Classification:

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

The proposed model for automatic clinical image caption generation combines the analysis of radiological scans with structured patient information from the textual records. It uses two language models, Show-Attend-Tell and GPT-3, to generate comprehensive and descriptive radiology records. The generated textual summary contains essential information about the pathologies found and their locations, along with 2D heatmaps that localize each pathology on the scans. The model has been tested on two medical datasets, Open-I and MIMIC-CXR, and on the general-purpose MS-COCO; the results, measured with natural language assessment metrics, demonstrate its efficient applicability to chest X-ray image captioning.


Pages: 12

