Diffusion model-based text-guided enhancement network for medical image segmentation

Cited by: 1
Authors
Dong, Zhiwei [1]
Yuan, Genji [1]
Hua, Zhen [1]
Li, Jinjiang [2]
Affiliations
[1] Shandong Technol & Business Univ, Sch Comp Sci & Technol, Yantai, Peoples R China
[2] Shandong Technol & Business Univ, Sch Informat & Elect Engn, Yantai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Denoising diffusion model; Text attention mechanism; Guided feature enhancement; Medical image segmentation; CONVOLUTIONAL NEURAL-NETWORK; CELL-NUCLEI; MISDIAGNOSIS; CLASSIFICATION;
DOI
10.1016/j.eswa.2024.123549
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In recent years, denoising diffusion models have achieved remarkable success in generating pixel-level representations with semantic values for image generation modeling. In this study, we propose a novel end-to-end framework, called TGEDiff, focusing on medical image segmentation. TGEDiff fuses a textual attention mechanism with the diffusion model by introducing an additional auxiliary categorization task to guide the diffusion model with textual information to generate excellent pixel-level representations. To overcome the limited perceptual fields of the independent feature encoders within the diffusion model, we introduce a multi-kernel excitation module to extend the model's perceptual capability. Meanwhile, a guided feature enhancement module is introduced in Denoising-UNet to focus the model's attention on important regions and attenuate the influence of noise and irrelevant background in medical images. We critically evaluated TGEDiff on three datasets (Kvasir-SEG, Kvasir-Sessile, and GLaS), and TGEDiff achieved significant improvements over the state-of-the-art approach on all three datasets, with F1 score and mIoU improving by 0.88% and 1.09%, 3.21% and 3.43%, and 1.29% and 2.34%, respectively. These data validate that TGEDiff has excellent performance in medical image segmentation. TGEDiff is expected to facilitate accurate diagnosis and treatment of medical diseases through more precise deconvolutional structural segmentation.
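The abstract reports results as F1 score and mIoU. As a rough illustration of how these two metrics are commonly computed for binary segmentation masks, here is a minimal sketch in plain Python. The function names (`confusion_counts`, `f1_score`, `iou`, `mean_iou`) are hypothetical, and averaging IoU over images is an assumption; the record does not say how the authors aggregate their scores.

```python
def confusion_counts(pred, target):
    """Count true positives, false positives and false negatives
    between two binary masks given as nested lists of 0/1."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def f1_score(pred, target):
    """F1 (equivalent to the Dice coefficient for binary masks)."""
    tp, fp, fn = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    # Two empty masks agree perfectly; treat that case as a score of 1.
    return 1.0 if denom == 0 else 2 * tp / denom


def iou(pred, target):
    """Intersection over union of the foreground regions."""
    tp, fp, fn = confusion_counts(pred, target)
    denom = tp + fp + fn
    return 1.0 if denom == 0 else tp / denom


def mean_iou(pairs):
    """Average IoU over (pred, target) pairs.
    Assumption: the mean is taken per image, not per class."""
    scores = [iou(p, t) for p, t in pairs]
    return sum(scores) / len(scores)


# Toy 2x3 masks: 2 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]

print(f1_score(pred, target))  # 2*2 / (2*2 + 1 + 1)
print(iou(pred, target))       # 2 / (2 + 1 + 1)
```

For a single pair of masks the two metrics are related by F1 = 2·IoU / (1 + IoU), so they move in the same direction on any one image but can differ in magnitude once averaged over a dataset.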
Pages: 18