Medical image captioning via generative pretrained transformers

Cited by: 18
Authors
Selivanov, Alexander [1, 2]
Rogov, Oleg Y. [1]
Chesakov, Daniil [1, 3]
Shelmanov, Artem [1, 3]
Fedulova, Irina [2]
Dylov, Dmitry V. [1]
Affiliations
[1] Skolkovo Inst Sci & Technol, Bolshoy blvd,30-1, Moscow 121205, Russia
[2] Philips Russia, Skolkovo Technopark 42,Bldg 1,Bolshoi Blvd, Moscow 121205, Russia
[3] AIRI, Kutuzovsky Ave,32 bld 1, Moscow 121170, Russia
Keywords
NETWORKS;
DOI
10.1038/s41598-023-31223-5
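Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;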
Abstract
The proposed model for automatic clinical image caption generation combines the analysis of radiological scans with structured patient information from textual records. It uses two language models, Show-Attend-Tell and GPT-3, to generate comprehensive and descriptive radiology records. The generated textual summary contains essential information about the pathologies found and their locations, along with 2D heatmaps that localize each pathology on the scans. The model was tested on two medical datasets, Open-I and MIMIC-CXR, and on the general-purpose MS-COCO dataset; the results, measured with natural language assessment metrics, demonstrated its efficient applicability to chest X-ray image captioning.
Pages: 12
Related papers
50 in total
  • [21] Generative adversarial network for semi-supervised image captioning
    Liang, Xu
    Li, Chen
    Tian, Lihua
    [J]. Computer Vision and Image Understanding, 2024, 249
  • [22] Interactions Guided Generative Adversarial Network for unsupervised image captioning
    Cao, Shan
    An, Gaoyun
    Zheng, Zhenxing
    Ruan, Qiuqi
    [J]. NEUROCOMPUTING, 2020, 417 : 419 - 431
  • [23] A Novel Image Captioning Method Based on Generative Adversarial Networks
    Fan, Yang
    Xu, Jungang
    Sun, Yingfei
    Wang, Yiyu
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2019: TEXT AND TIME SERIES, PT IV, 2019, 11730 : 281 - 292
  • [24] TextCycleGAN: Cyclical-Generative Adversarial Networks for Image Captioning
    Alam, Mohammad
    Isoda, Nicole
    Manzanares, Mitch
    Delgado, Anthony
    Panggabean, Antonius F.
    [J]. ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING FOR MULTI-DOMAIN OPERATIONS APPLICATIONS III, 2021, 11746
  • [25] TRIPLE SEQUENCE GENERATIVE ADVERSARIAL NETS FOR UNSUPERVISED IMAGE CAPTIONING
    Zhou, Yucheng
    Tao, Wei
    Zhang, Wenqiang
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7598 - 7602
  • [26] Engaging Image Captioning via Personality
    Shuster, Kurt
    Humeau, Samuel
    Hu, Hexiang
    Bordes, Antoine
    Weston, Jason
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 12508 - 12518
  • [27] Knowledge Neurons in Pretrained Transformers
    Dai, Damai
    Dong, Li
    Hao, Yaru
    Sui, Zhifang
    Chang, Baobao
    Wei, Furu
    [J]. PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 8493 - 8502
  • [28] Finetuning Pretrained Transformers into RNNs
    Kasai, Jungo
    Peng, Hao
    Zhang, Yizhe
    Yogatama, Dani
    Ilharco, Gabriel
    Pappas, Nikolaos
    Mao, Yi
    Chen, Weizhu
    Smith, Noah A.
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 10630 - 10643
  • [29] Multi-Attention Generative Adversarial Network for image captioning
    Wei, Yiwei
    Wang, Leiquan
    Cao, Haiwen
    Shao, Mingwen
    Wu, Chunlei
    [J]. NEUROCOMPUTING, 2020, 387 : 91 - 99
  • [30] Controllable Image Captioning via Prompting
    Wang, Ning
    Xie, Jiahao
    Wu, Jihao
    Jia, Mingbo
    Li, Linlin
    [J]. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 2, 2023, : 2617 - 2625