Vision Mamba and xLSTM-UNet for medical image segmentation

Authors
Xin Zhong [1]
Gehao Lu [1]
Hao Li [1]
Affiliations
[1] Yunnan University, School of Information Science and Engineering
Keywords
Deep Learning; Medical Image Segmentation; SSM; xLSTM
DOI
10.1038/s41598-025-88967-5
Abstract
Deep learning-based medical image segmentation methods are generally divided into convolutional neural networks (CNNs) and Transformer-based models. Traditional CNNs are limited by their receptive field, making it challenging to capture long-range dependencies. While Transformers excel at modeling global information, their high computational complexity restricts their practical application in clinical scenarios. To address these limitations, this study introduces VMAXL-UNet, a novel segmentation network that integrates Structured State Space Models (SSM) and lightweight LSTMs (xLSTM). The network incorporates Visual State Space (VSS) and ViL modules in the encoder to efficiently fuse local boundary details with global semantic context. The VSS module leverages SSM to capture long-range dependencies and extract critical features from distant regions. Meanwhile, the ViL module employs a gating mechanism to enhance the integration of local and global features, thereby improving segmentation accuracy and robustness. Experiments on datasets such as ISIC17, ISIC18, CVC-ClinicDB, and Kvasir demonstrate that VMAXL-UNet significantly outperforms traditional CNNs and Transformer-based models in capturing lesion boundaries and their distant correlations. These results highlight the model’s superior performance and provide a promising approach for efficient segmentation in complex medical imaging scenarios.
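The abstract's claim that a state space model can carry information across distant positions can be illustrated with a toy recurrence. The sketch below is not the paper's VSS module; it is a scalar, fixed-parameter state-space scan, and the function name `ssm_scan` and the parameters `a`, `b`, `c` are hypothetical names chosen for illustration only.

```python
def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Run a scalar discrete state-space recurrence over a sequence.

    h_t = a * h_{t-1} + b * x_t   (state update)
    y_t = c * h_t                 (readout)

    a, b, c are illustrative constants, not values from the paper.
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x  # the state accumulates a decayed history of inputs
        ys.append(c * h)
    return ys


# A single impulse at position 0 followed by zeros.
signal = [1.0] + [0.0] * 9
outputs = ssm_scan(signal)

# The last output is still nonzero: the impulse at position 0
# influences position 9 through the carried state.
print(outputs[0], outputs[-1])
```

The scan touches each element once, which is the intuition behind the lower cost of SSM-style layers relative to pairwise attention; selective variants such as Mamba make the parameters depend on the input, which this toy version does not attempt.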
Related papers
50 in total
  • [21] Swin Unet3D: a three-dimensional medical image segmentation network combining vision transformer and convolution
    Cai, Yimin
    Long, Yuqing
    Han, Zhenggong
    Liu, Mingkun
    Zheng, Yuchen
    Yang, Wei
    Chen, Liming
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2023, 23 (01)
  • [22] Swin Unet3D: a three-dimensional medical image segmentation network combining vision transformer and convolution
    Yimin Cai
    Yuqing Long
    Zhenggong Han
    Mingkun Liu
    Yuchen Zheng
    Wei Yang
    Liming Chen
    BMC Medical Informatics and Decision Making, 23
  • [23] A Comprehensive Exploration of L-UNet Approach: Revolutionizing Medical Image Segmentation
    Alafer F.
    Siddiqi M.H.
    Khan M.S.
    Ahmad I.
    Alhujaili S.
    Alrowaili Z.
    Alshabibi A.S.
    IEEE Access, 2024, 12 : 1 - 1
  • [24] A novel full-convolution UNet-transformer for medical image segmentation
    Zhu, Tianyou
    Ding, Derui
    Wang, Feng
    Liang, Wei
    Wang, Bo
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 89
  • [25] A Medical Image Segmentation Method Based on Improved UNet 3+ Network
    Xu, Yang
    Hou, Shike
    Wang, Xiangyu
    Li, Duo
    Lu, Lu
    DIAGNOSTICS, 2023, 13 (03)
  • [26] UCSwin-UNet model for medical image segmentation based on cardiac haemangioma
    Shi, Jian-Ting
    Qu, Gui-Xu
    Li, Zhi-Jun
    IET IMAGE PROCESSING, 2024
  • [27] ERDUnet: An Efficient Residual Double-Coding Unet for Medical Image Segmentation
    Li, Hao
    Zhai, Di-Hua
    Xia, Yuanqing
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (04) : 2083 - 2096
  • [28] N-Net: an UNet architecture with dual encoder for medical image segmentation
    Liang, Bingtao
    Tang, Chen
    Zhang, Wei
    Xu, Min
    Wu, Tianbo
    SIGNAL IMAGE AND VIDEO PROCESSING, 2023, 17 (06) : 3073 - 3081
  • [29] Pie-UNet: A Novel Parallel Interaction Encoder for Medical Image Segmentation
    Jiang, Youtao
    Zhang, Xiaoqian
    Chen, Yufeng
    Yang, Shukai
    Sun, Feng
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT II, 2023, 14255 : 558 - 569
  • [30] N-Net: an UNet architecture with dual encoder for medical image segmentation
    Bingtao Liang
    Chen Tang
    Wei Zhang
    Min Xu
    Tianbo Wu
    Signal, Image and Video Processing, 2023, 17 : 3073 - 3081