Diverter transformer-based multi-encoder-multi-decoder network model for medical retinal blood vessel image segmentation

Cited by: 0
Authors
Wu, Chengwei [1]
Guo, Min [1]
Ma, Miao [1]
Wang, Kaiguang [1]
Affiliations
[1] Shaanxi Normal Univ, Sch Comp Sci, Minist Educ, Key Lab Modern Teaching Technol, Xian 710119, Peoples R China
Keywords
Medical image processing; Encoder-decoder architecture; Local context; Retinal vessel segmentation; U-Net
DOI
10.1016/j.bspc.2024.106132
Chinese Library Classification number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Retinal blood vessels are an essential part of the fundus structure, and accurate analysis of their structure and distribution supports accurate medical diagnosis. However, extracting detailed information remains challenging because retinal blood vessel images suffer from fuzzy edges, low resolution, and heavy noise. To extract image detail effectively, we propose a new diverter transformer-based multi-encoder-multi-decoder network model. The network model consists of a feature encoding module and a feature decoding module. The feature encoding module consists of a diverter transformer with a diverter adaptive mechanism and three encoder units with a convolution layer and a max-pooling layer; the two decoder units in the feature decoding module consist of an inverse convolution layer and an up-sampling layer, respectively. The Local Context Module (LCNet Module) in the feature encoding module learns richer local context features layer by layer by changing the width of the network while downsampling. The Global Encoder Module1 (G-Encoder Module1) and the Global Encoder Module2 (G-Encoder Module2) perform a max-pooling operation that transforms the input data into a vector of fixed dimensions, which helps the network model better understand and extract the global feature representation of retinal blood vessel images. The two decoder units in the feature decoding module receive local and global feature information from the three encoder units: LCNet Module, G-Encoder Module1, and G-Encoder Module2. Decoder Module1 generates the segmentation prediction by layer-by-layer up-sampling, while Decoder Module2 recovers feature information through downsampling and decoding operations and fuses the recovered features to produce the final segmentation of the retinal blood vessels. The proposed model is compared with other classical and state-of-the-art network models on the DRIVE and STARE datasets, where it reaches a segmentation accuracy of 97.25% and 97.93%, respectively. Compared with the classical U-Net model, the improvement is 2.24% and 1.42%, respectively. Compared with the state-of-the-art SPNet model, accuracy increases by 0.61% on DRIVE and 1.01% on STARE. These results indicate that the proposed network model has a significant competitive advantage in improving the segmentation performance of retinal blood vessel images.
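Two ideas in the abstract can be illustrated with a small sketch: a max-pooling step that maps input of any spatial size to a vector of fixed dimensions, and a pixel-level accuracy score. The code below is not the authors' implementation. The function names `global_max_pool` and `pixel_accuracy` are hypothetical, the pooling is assumed to be a per-channel global maximum, and accuracy is assumed to be the fraction of pixels whose predicted label matches the ground truth. The abstract does not confirm either detail.

```python
from typing import List

# Assumed layout: channels x height x width, as nested Python lists.
FeatureMap = List[List[List[float]]]
Mask = List[List[int]]


def global_max_pool(feature_map: FeatureMap) -> List[float]:
    """Keep the maximum of each channel, so the output length equals the
    number of channels regardless of the spatial size of the input."""
    return [max(max(row) for row in channel) for channel in feature_map]


def pixel_accuracy(pred: Mask, truth: Mask) -> float:
    """Fraction of pixels whose predicted label equals the ground truth
    (an assumed definition; the abstract does not spell out the metric)."""
    pairs = [(p, t) for p_row, t_row in zip(pred, truth)
             for p, t in zip(p_row, t_row)]
    correct = sum(1 for p, t in pairs if p == t)
    return correct / len(pairs)


# Two inputs with different spatial sizes give vectors of the same length.
small = [[[1.0, 2.0], [3.0, 4.0]],
         [[-1.0, -5.0], [0.0, -2.0]]]
large = [[[0.5] * 4 for _ in range(3)],
         [[0.1] * 4 for _ in range(3)]]
small_vec = global_max_pool(small)
large_vec = global_max_pool(large)

# Toy 2 x 2 vessel masks: 1 = vessel, 0 = background.
acc = pixel_accuracy([[1, 0], [0, 1]], [[1, 0], [1, 1]])
```

Both `small_vec` and `large_vec` have one entry per channel, which is the fixed-dimension property the abstract attributes to the global encoder modules.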
Pages: 24