Text-Guided Cross-Position Attention for Segmentation: Case of Medical Image

Cited by: 1
Authors
Lee, Go-Eun [1]
Kim, Seon Ho [2]
Cho, Jungchan [3]
Choi, Sang Tae [4]
Choi, Sang-Il [1]
Affiliations
[1] Dankook Univ, Yongin, Gyeonggi Do, South Korea
[2] Univ Southern Calif, Los Angeles, CA 90007 USA
[3] Gachon Univ, Seongnam, Gyeonggi Do, South Korea
[4] Chung Ang Univ, Coll Med, Seoul, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Image Segmentation; Multi Modal Learning; Cross Position Attention; Text-Guided Attention; Medical Image; TRANSFORMER;
DOI
10.1007/978-3-031-43904-9_52
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a novel text-guided cross-position attention module that applies the multi-modality of text and image to position attention in medical image segmentation. To match the dimension of the text feature to that of the image feature map, we multiply the text features by learnable parameters and combine the multi-modal semantics via cross-attention. This allows the model to learn the dependency between various characteristics of text and image. Our proposed model demonstrates superior performance compared to other medical models using image-only data or image-text data. Furthermore, we utilize our module as a region of interest (RoI) generator to classify the inflammation of the sacroiliac joints. The RoIs obtained from the model contribute to improving the performance of classification models.
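The dimension-matching and cross-attention step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `text_guided_cross_attention`, the projection matrix `w_text`, the choice of image positions as queries and projected text tokens as keys and values, the scaled dot-product form, the residual addition, and all tensor sizes are assumptions made only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def text_guided_cross_attention(image_feat, text_feat, w_text):
    """Hypothetical sketch of text-guided attention over image positions.

    image_feat: (C, H, W) image feature map
    text_feat:  (L, D_t) text token features
    w_text:     (D_t, C) learnable projection (assumed) that matches the
                text feature dimension to the image channel dimension
    """
    c, h, w = image_feat.shape
    # Flatten spatial positions: one C-dimensional vector per position.
    pixels = image_feat.reshape(c, h * w).T          # (N, C), N = H * W
    # Multiply text features by learnable parameters to match dimensions.
    text_proj = text_feat @ w_text                   # (L, C)
    # Each image position (query) attends over the text tokens (keys).
    scores = pixels @ text_proj.T / np.sqrt(c)       # (N, L)
    weights = softmax(scores, axis=-1)               # rows sum to 1
    attended = weights @ text_proj                   # (N, C)
    # Residual fusion of text-derived semantics into the image features.
    fused = pixels + attended
    return fused.T.reshape(c, h, w), weights


# Toy sizes chosen arbitrarily for the example.
rng = np.random.default_rng(0)
image_feat = rng.normal(size=(8, 4, 4))     # C=8, H=W=4
text_feat = rng.normal(size=(5, 16))        # L=5 tokens, D_t=16
w_text = rng.normal(size=(16, 8)) * 0.1     # would be learned in practice
fused, weights = text_guided_cross_attention(image_feat, text_feat, w_text)
print(fused.shape, weights.shape)
```

In a trained model `w_text` would be optimized jointly with the segmentation network; here it is random, so the attention weights carry no meaning beyond showing the shapes involved.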
Pages: 537-546
Page count: 10
Related papers
50 in total
  • [1] MCPANet: Multiscale Cross-Position Attention Network for Retinal Vessel Image Segmentation
    Jiang, Yun
    Liang, Jing
    Cheng, Tongtong
    Zhang, Yuan
    Lin, Xin
    Dong, Jinkun
    [J]. SYMMETRY-BASEL, 2022, 14 (07):
  • [2] Text-Guided Attention Model for Image Captioning
    Mun, Jonghwan
    Cho, Minsu
    Han, Bohyung
    [J]. THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 4233 - 4239
  • [3] TGANet: Text-Guided Attention for Improved Polyp Segmentation
    Tomar, Nikhil Kumar
    Jha, Debesh
    Bagci, Ulas
    Ali, Sharib
    [J]. MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT III, 2022, 13433 : 151 - 160
  • [4] Enhanced Text-Guided Attention Model for Image Captioning
    Zhou, Yuanen
    Hu, Zhenzhen
    Zhao, Ye
    Liu, Xueliang
    Hong, Richang
    [J]. 2018 IEEE FOURTH INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM), 2018,
  • [5] SEGMENTATION-AWARE TEXT-GUIDED IMAGE MANIPULATION
    Haruyama, Tomoki
    Togo, Ren
    Maeda, Keisuke
    Ogawa, Takahiro
    Haseyama, Miki
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 2433 - 2437
  • [6] Diffusion model-based text-guided enhancement network for medical image segmentation
    Dong, Zhiwei
    Yuan, Genji
    Hua, Zhen
    Li, Jinjiang
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 249
  • [7] Text-Guided Image Inpainting
    Zhang, Zijian
    Zhao, Zhou
    Zhang, Zhu
    Huai, Baoxing
    Yuan, Jing
    [J]. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 4079 - 4087
  • [8] Text-guided Attention Mechanism Fine-grained Image Classification
    Yang, Xinglin
    Pan, Heng
    [J]. 2022 THE 6TH INTERNATIONAL CONFERENCE ON VIRTUAL AND AUGMENTED REALITY SIMULATIONS, ICVARS 2022, 2022, : 45 - 49
  • [9] GENERATIVE ADVERSARIAL NETWORK INCLUDING REFERRING IMAGE SEGMENTATION FOR TEXT-GUIDED IMAGE MANIPULATION
    Watanabe, Yuto
    Togo, Ren
    Maeda, Keisuke
    Ogawa, Takahiro
    Haseyama, Miki
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 4818 - 4822
  • [10] Text-guided visual representation learning for medical image retrieval systems
    Serieys, Guillaume
    Kurtz, Camille
    Fournier, Laure
    Cloppet, Florence
    [J]. 2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 593 - 598