Diffusion model-based text-guided enhancement network for medical image segmentation

被引:1
|
作者
Dong, Zhiwei [1 ]
Yuan, Genji [1 ]
Hua, Zhen [1 ]
Li, Jinjiang [2 ]
机构
[1] Shandong Technol & Business Univ, Sch Comp Sci & Technol, Yantai, Peoples R China
[2] Shandong Technol & Business Univ, Sch Informat & Elect Engn, Yantai, Peoples R China
基金
中国国家自然科学基金;
关键词
Denoising diffusion model; Text attention mechanism; Guided feature enhancement; Medical image segmentation; CONVOLUTIONAL NEURAL-NETWORK; CELL-NUCLEI; MISDIAGNOSIS; CLASSIFICATION;
D O I
10.1016/j.eswa.2024.123549
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In recent years, denoising diffusion models have achieved remarkable success in generating pixel-level representations with semantic values for image generation modeling. In this study, we propose a novel end -toend framework, called TGEDiff, focusing on medical image segmentation. TGEDiff fuses a textual attention mechanism with the diffusion model by introducing an additional auxiliary categorization task to guide the diffusion model with textual information to generate excellent pixel-level representations. To overcome the limitation of limited perceptual fields for independent feature encoders within the diffusion model, we introduce a multi-kernel excitation module to extend the model's perceptual capability. Meanwhile, a guided feature enhancement module is introduced in Denoising-UNet to focus the model's attention on important regions and attenuate the influence of noise and irrelevant background in medical images. We critically evaluated TGEDiff on three datasets (Kvasir-SEG, Kvasir-Sessile, and GLaS), and TGEDiff achieved significant improvements over the state -of -the -art approach on all three datasets, with F1 scores and mIoU improving by 0.88% and 1.09%, 3.21% and 3.43%, respectively, 1.29% and 2.34%. These data validate that TGEDiff has excellent performance in medical image segmentation. TGEDiff is expected to facilitate accurate diagnosis and treatment of medical diseases through more precise deconvolutional structural segmentation.
引用
下载
收藏
页数:18
相关论文
共 50 条
  • [1] GENERATIVE ADVERSARIAL NETWORK INCLUDING REFERRING IMAGE SEGMENTATION FOR TEXT-GUIDED IMAGE MANIPULATION
    Watanabe, Yuto
    Togo, Ren
    Maeda, Keisuke
    Ogawa, Takahiro
    Haseyama, Miki
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 4818 - 4822
  • [2] Text-Guided Cross-Position Attention for Segmentation: Case of Medical Image
    Lee, Go-Eun
    Kim, Seon Ho
    Cho, Jungchan
    Choi, Sang Tae
    Choi, Sang-Il
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT V, 2023, 14224 : 537 - 546
  • [3] SEGMENTATION-AWARE TEXT-GUIDED IMAGE MANIPULATION
    Haruyama, Tomoki
    Togo, Ren
    Maeda, Keisuke
    Ogawa, Takahiro
    Haseyama, Miki
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 2433 - 2437
  • [4] Text-guided image-to-sketch diffusion models☆
    Ke, Aihua
    Huang, Yujie
    Cai, Bo
    Yang, Jie
    KNOWLEDGE-BASED SYSTEMS, 2024, 304
  • [5] Text-Guided Image Manipulation via Generative Adversarial Network With Referring Image Segmentation-Based Guidance
    Watanabe, Yuto
    Togo, Ren
    Maeda, Keisuke
    Ogawa, Takahiro
    Haseyama, Miki
    IEEE ACCESS, 2023, 11 : 42534 - 42545
  • [6] Text-Guided Attention Model for Image Captioning
    Mun, Jonghwan
    Cho, Minsu
    Han, Bohyung
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 4233 - 4239
  • [7] DTAN: Diffusion-based Text Attention Network for medical image segmentation
    Zhao, Yiyang
    Li, Jinjiang
    Ren, Lu
    Chen, Zheng
    COMPUTERS IN BIOLOGY AND MEDICINE, 2024, 168
  • [8] Text-Guided Image Inpainting
    Zhang, Zijian
    Zhao, Zhou
    Zhang, Zhu
    Huai, Baoxing
    Yuan, Jing
    MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 4079 - 4087
  • [9] A Text-Guided Generation and Refinement Model for Image Captioning
    Wang, Depeng
    Hu, Zhenzhen
    Zhou, Yuanen
    Hong, Richang
    Wang, Meng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 2966 - 2977
  • [10] Text-Guided Molecule Generation with Diffusion Language Model
    Gong, Haisong
    Liu, Qiang
    Wu, Shu
    Wang, Liang
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 1, 2024, : 109 - 117