TUMSyn: A Text-Guided Generalist Model for Customized Multimodal MR Image Synthesis

Times Cited: 0
Authors
Wang, Yulin [1 ,2 ,3 ,4 ]
Xiong, Honglin [3 ,4 ]
Xie, Yi [3 ,4 ]
Liu, Jiameng [3 ,4 ]
Wang, Qian [3 ,4 ,5 ]
Liu, Qian [1 ,2 ,6 ]
Shen, Dinggang [3 ,4 ,5 ]
Affiliations
[1] Hainan Univ, Sch Biomed Engn, Haikou 570228, Hainan, Peoples R China
[2] Hainan Univ, State Key Lab Digital Med Engn, Haikou 570228, Hainan, Peoples R China
[3] ShanghaiTech Univ, Sch Biomed Engn, Shanghai 201210, Peoples R China
[4] ShanghaiTech Univ, State Key Lab Adv Med Mat & Devices, Shanghai 201210, Peoples R China
[5] Shanghai Clin Res & Trial Ctr, Shanghai 201210, Peoples R China
[6] Hainan Univ, Hlth Inst 1, Key Lab Biomed Engn Hainan Prov, Haikou 570228, Hainan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Foundation Model; Multimodal MRI; MRI Synthesis; Super-resolution; FOUNDATION MODELS;
DOI
10.1007/978-3-031-73471-7_13
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Multimodal magnetic resonance (MR) imaging has revolutionized our understanding of the human brain. However, various limitations in clinical scanning hinder data acquisition. Current medical image synthesis techniques, often designed for specific tasks or modalities, exhibit diminished performance when confronted with heterogeneous-source MRI data. Here we introduce TUMSyn, a Text-guided Universal MR image Synthesis generalist model that generates text-specified multimodal brain MRI sequences from any real-acquired sequence. By leveraging demographic data and imaging parameters as text prompts, TUMSyn performs diverse cross-sequence synthesis tasks with a unified model. To enhance the efficacy of text features in steering synthesis, we pre-train a text encoder with a contrastive learning strategy that aligns and fuses image and text semantic information. Developed and evaluated on a multi-center dataset of over 20K brain MR image-text pairs covering 7 structural MR contrasts, spanning almost the entire age range and a variety of physical conditions, TUMSyn performs comparably to or better than task-specific methods in both supervised and zero-shot settings, and the synthesized images exhibit accurate anatomical morphology suitable for various downstream clinical tasks. In summary, by incorporating text metadata into image synthesis, TUMSyn's accuracy, versatility, and generalizability position it as a powerful augmentative tool for conventional MRI systems, offering rapid and cost-effective acquisition of multi-sequence MR images for clinical and research applications.
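The contrastive pre-training of the text encoder mentioned in the abstract can be illustrated with a CLIP-style symmetric InfoNCE objective over paired image and text embeddings. This is a minimal NumPy sketch of that general technique, not the paper's actual implementation; the function name, temperature value, and embedding shapes are all illustrative assumptions.

```python
import numpy as np

def contrastive_alignment_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    image_emb, text_emb: (N, D) arrays; row i of each is a matched
    image-text pair, and all other rows serve as in-batch negatives.
    """
    # L2-normalize so the dot product is cosine similarity.
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # Pairwise similarity matrix, scaled by temperature.
    logits = img @ txt.T / temperature
    n = logits.shape[0]

    def xent(l):
        # Cross-entropy where the diagonal (matched pair) is the target.
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(n), np.arange(n)].mean()

    # Average the image-to-text and text-to-image directions.
    return 0.5 * (xent(logits) + xent(logits.T))
```

Minimizing this loss pulls each image embedding toward its paired text embedding while pushing it away from the other texts in the batch, which is what allows text prompts to steer the synthesis network.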
Pages: 124-133 (10 pages)