TUMSyn: A Text-Guided Generalist Model for Customized Multimodal MR Image Synthesis

Authors
Wang, Yulin [1, 2, 3, 4]
Xiong, Honglin [3, 4]
Xie, Yi [3, 4]
Liu, Jiameng [3, 4]
Wang, Qian [3, 4, 5]
Liu, Qian [1, 2, 6]
Shen, Dinggang [3, 4, 5]
Affiliations
[1] Hainan Univ, Sch Biomed Engn, Haikou 570228, Hainan, Peoples R China
[2] Hainan Univ, State Key Lab Digital Med Engn, Haikou 570228, Hainan, Peoples R China
[3] ShanghaiTech Univ, Sch Biomed Engn, Shanghai 201210, Peoples R China
[4] ShanghaiTech Univ, State Key Lab Adv Med Mat & Devices, Shanghai 201210, Peoples R China
[5] Shanghai Clin Res & Trial Ctr, Shanghai 201210, Peoples R China
[6] Hainan Univ, Hlth Inst 1, Key Lab Biomed Engn Hainan Prov, Haikou 570228, Hainan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Foundation Model; Multimodal MRI; MRI Synthesis; Super-resolution
DOI
10.1007/978-3-031-73471-7_13
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multimodal magnetic resonance (MR) imaging has revolutionized our understanding of the human brain. However, various limitations in clinical scanning hinder the data acquisition process. Current medical image synthesis techniques, often designed for specific tasks or modalities, exhibit diminished performance when confronted with heterogeneous-source MRI data. Here we introduce a Text-guided Universal MR image Synthesis (TUMSyn) generalist model to generate text-specified multimodal brain MRI sequences from any real-acquired sequences. By leveraging demographic data and imaging parameters as text prompts, TUMSyn performs diverse cross-sequence synthesis tasks using a unified model. To enhance the efficacy of text features in steering synthesis, we pre-train a text encoder using a contrastive learning strategy to align and fuse image and text semantic information. Developed and evaluated on a multi-center dataset of over 20K brain MR image-text pairs with 7 structural MR contrasts, spanning almost the entire age spectrum and various physical conditions, TUMSyn demonstrates performance comparable to or exceeding that of task-specific methods in both supervised and zero-shot settings, and the synthesized images exhibit accurate anatomical morphology suitable for various downstream clinical-related tasks. In summary, by incorporating text metadata into image synthesis, TUMSyn's accuracy, versatility, and generalizability position it as a powerful augmentative tool for conventional MRI systems, offering rapid and cost-effective acquisition of multi-sequence MR images for clinical and research applications.
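The abstract describes two ideas without detail: text prompts built from demographic data and imaging parameters, and contrastive pre-training that aligns image and text features. The sketch below illustrates both in generic form. It is not the authors' implementation: the prompt wording, the field names (`age`, `sex`, `sequence`, `tr_ms`, `te_ms`), the temperature value, and the CLIP-style symmetric loss are all assumptions chosen for illustration, and random vectors stand in for real encoder outputs.

```python
import numpy as np


def build_prompt(age, sex, sequence, tr_ms, te_ms):
    # Hypothetical prompt template; the abstract does not give the actual wording.
    return (f"Age: {age}; Sex: {sex}; Sequence: {sequence}; "
            f"TR: {tr_ms} ms; TE: {te_ms} ms")


def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def log_softmax(logits):
    # Row-wise log-softmax, shifted by the row maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def symmetric_contrastive_loss(image_emb, text_emb, temperature=0.07):
    # Assumed CLIP-style objective: row i of image_emb and row i of text_emb
    # form the matching pair; every other row in the batch is a negative.
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    logits = img @ txt.T / temperature
    idx = np.arange(logits.shape[0])
    loss_i2t = -log_softmax(logits)[idx, idx].mean()    # image -> text
    loss_t2i = -log_softmax(logits.T)[idx, idx].mean()  # text -> image
    return 0.5 * (loss_i2t + loss_t2i)


# Toy demonstration with random embeddings in place of encoder outputs.
rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))
aligned_loss = symmetric_contrastive_loss(emb, emb)           # pairs match
mismatched_loss = symmetric_contrastive_loss(emb, emb[::-1])  # pairs scrambled
prompt = build_prompt(35, "F", "T2-weighted", 3200, 408)
```

In this toy setup the loss is lower when each image embedding is paired with its own text embedding than when the pairing is scrambled, which is the property a contrastive objective of this kind rewards during pre-training.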
Pages: 124-133
Page count: 10