Diffusion model-based text-guided enhancement network for medical image segmentation

Cited by: 1
Authors
Dong, Zhiwei [1]
Yuan, Genji [1]
Hua, Zhen [1]
Li, Jinjiang [2]
Affiliations
[1] Shandong Technol & Business Univ, Sch Comp Sci & Technol, Yantai, Peoples R China
[2] Shandong Technol & Business Univ, Sch Informat & Elect Engn, Yantai, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Denoising diffusion model; Text attention mechanism; Guided feature enhancement; Medical image segmentation; CONVOLUTIONAL NEURAL-NETWORK; CELL-NUCLEI; MISDIAGNOSIS; CLASSIFICATION;
DOI
10.1016/j.eswa.2024.123549
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, denoising diffusion models have achieved remarkable success in generating semantically meaningful pixel-level representations for image generation. In this study, we propose a novel end-to-end framework, called TGEDiff, for medical image segmentation. TGEDiff fuses a textual attention mechanism with the diffusion model by introducing an auxiliary categorization task, so that textual information guides the diffusion model toward better pixel-level representations. To overcome the limited perceptual fields of the independent feature encoders within the diffusion model, we introduce a multi-kernel excitation module to extend the model's perceptual capability. Meanwhile, a guided feature enhancement module is introduced in Denoising-UNet to focus the model's attention on important regions and attenuate the influence of noise and irrelevant background in medical images. We evaluated TGEDiff on three datasets (Kvasir-SEG, Kvasir-Sessile, and GLaS), where it improved over the state-of-the-art approach on all three, with F1 score and mIoU gains of 0.88% and 1.09%, 3.21% and 3.43%, and 1.29% and 2.34%, respectively. These results indicate that TGEDiff performs well in medical image segmentation. TGEDiff is expected to facilitate accurate diagnosis and treatment of diseases through more precise deconvolutional structural segmentation.
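The abstract reports results as F1 score and mIoU but does not define them. As a minimal sketch, assuming the standard binary-mask definitions (F1 = 2TP / (2TP + FP + FN), IoU = TP / (TP + FP + FN)) and assuming mIoU is averaged over images, the two metrics could be computed as follows. The function names `f1_and_iou` and `mean_scores` are hypothetical and are not taken from the paper.

```python
def f1_and_iou(pred, target):
    """Return (F1, IoU) for two equally sized binary masks (nested lists of 0/1)."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1  # predicted foreground, truly foreground
            elif p and not t:
                fp += 1  # predicted foreground, truly background
            elif t and not p:
                fn += 1  # missed foreground pixel
    # Assumed convention: two empty masks count as a perfect match.
    if tp + fp + fn == 0:
        return 1.0, 1.0
    f1 = 2 * tp / (2 * tp + fp + fn)
    iou = tp / (tp + fp + fn)
    return f1, iou


def mean_scores(preds, targets):
    """Average F1 and IoU over a list of (prediction, ground truth) mask pairs."""
    scores = [f1_and_iou(p, t) for p, t in zip(preds, targets)]
    n = len(scores)
    mean_f1 = sum(s[0] for s in scores) / n
    mean_iou = sum(s[1] for s in scores) / n
    return mean_f1, mean_iou


# Toy 2x3 masks: 2 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
f1, iou = f1_and_iou(pred, target)
```

For this toy pair the counts give F1 = 4/6 and IoU = 2/4. Other averaging conventions (for example per class, or pooled over all pixels of a dataset) exist and may yield different numbers than the ones reported in the abstract.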
Pages: 18
Related papers
50 records in total
  • [21] Text-Guided Customizable Image Synthesis and Manipulation
    Zhang, Zhiqiang
    Fu, Chen
    Weng, Wei
    Zhou, Jinjia
    APPLIED SCIENCES-BASEL, 2022, 12 (20):
  • [22] TIC: text-guided image colorization using conditional generative model
    Ghosh, Subhankar
    Roy, Prasun
    Bhattacharya, Saumik
    Pal, Umapada
    Blumenstein, Michael
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (14) : 41121 - 41136
  • [23] TIC: text-guided image colorization using conditional generative model
    Subhankar Ghosh
    Prasun Roy
    Saumik Bhattacharya
    Umapada Pal
    Michael Blumenstein
    Multimedia Tools and Applications, 2024, 83 : 41121 - 41136
  • [24] GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
    Nichol, Alex
    Dhariwal, Prafulla
    Ramesh, Aditya
    Shyam, Pranav
    Mishkin, Pamela
    McGrew, Bob
    Sutskever, Ilya
    Chen, Mark
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [25] Controlling Attention Map Better for Text-Guided Image Editing Diffusion Models
    Xu, Siqi
    Sun, Lijun
    Liu, Guanming
    Wei, Zhihua
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT XII, ICIC 2024, 2024, 14873 : 54 - 65
  • [26] TGANet: Text-Guided Attention for Improved Polyp Segmentation
    Tomar, Nikhil Kumar
    Jha, Debesh
    Bagci, Ulas
    Ali, Sharib
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT III, 2022, 13433 : 151 - 160
  • [27] MaskDiffuse: Text-Guided Face Mask Removal Based on Diffusion Models
    Lu, Jingxia
    Hou, Xianxu
    Li, Hao
    Peng, Zhibin
    Shen, Linlin
    Fan, Lixin
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VI, 2024, 14430 : 435 - 446
  • [28] Gaussian model-based statistical matching for image enhancement and segmentation
    Zheng, Yufeng
    VISUAL INFORMATION PROCESSING XVII, 2008, 6978
  • [29] Text-Guided Neural Network Training for Image Recognition in Natural Scenes and Medicine
    Zhang, Zizhao
    Chen, Pingjun
    Shi, Xiaoshuang
    Yang, Lin
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (05) : 1733 - 1745
  • [30] Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer
    Yang, Serin
    Hwang, Hyunmin
    Ye, Jong Chul
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023 : 22816 - 22825