Medical image captioning via generative pretrained transformers

Cited by: 18
Authors
Selivanov, Alexander [1, 2]
Rogov, Oleg Y. [1]
Chesakov, Daniil [1, 3]
Shelmanov, Artem [1, 3]
Fedulova, Irina [2]
Dylov, Dmitry V. [1]
Affiliations
[1] Skolkovo Inst Sci & Technol, Bolshoy blvd,30-1, Moscow 121205, Russia
[2] Philips Russia, Skolkovo Technopark 42,Bldg 1,Bolshoi Blvd, Moscow 121205, Russia
[3] AIRI, Kutuzovsky Ave,32 bld 1, Moscow 121170, Russia
Keywords
NETWORKS
DOI
10.1038/s41598-023-31223-5
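Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract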
The proposed model for automatic clinical image caption generation combines the analysis of radiological scans with structured patient information from textual records. It uses two language models, Show-Attend-Tell and GPT-3, to generate comprehensive and descriptive radiology records. The generated textual summary contains essential information about the pathologies found and their location, and is accompanied by 2D heatmaps that localize each pathology on the scans. The model has been tested on two medical datasets, Open-I and MIMIC-CXR, and on the general-purpose MS-COCO; the results, measured with natural language assessment metrics, demonstrate its applicability to chest X-ray image captioning.
Pages: 12
Journal
Scientific Reports, 13