TUMSyn: A Text-Guided Generalist Model for Customized Multimodal MR Image Synthesis

Authors
Wang, Yulin [1 ,2 ,3 ,4 ]
Xiong, Honglin [3 ,4 ]
Xie, Yi [3 ,4 ]
Liu, Jiameng [3 ,4 ]
Wang, Qian [3 ,4 ,5 ]
Liu, Qian [1 ,2 ,6 ]
Shen, Dinggang [3 ,4 ,5 ]
Affiliations
[1] Hainan Univ, Sch Biomed Engn, Haikou 570228, Hainan, Peoples R China
[2] Hainan Univ, State Key Lab Digital Med Engn, Haikou 570228, Hainan, Peoples R China
[3] ShanghaiTech Univ, Sch Biomed Engn, Shanghai 201210, Peoples R China
[4] ShanghaiTech Univ, State Key Lab Adv Med Mat & Devices, Shanghai 201210, Peoples R China
[5] Shanghai Clin Res & Trial Ctr, Shanghai 201210, Peoples R China
[6] Hainan Univ, Hlth Inst 1, Key Lab Biomed Engn Hainan Prov, Haikou 570228, Hainan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Foundation Model; Multimodal MRI; MRI Synthesis; Super-resolution
DOI
10.1007/978-3-031-73471-7_13
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multimodal magnetic resonance (MR) imaging has revolutionized our understanding of the human brain. However, various limitations in clinical scanning hinder the data acquisition process. Current medical image synthesis techniques, often designed for specific tasks or modalities, perform worse when confronted with MRI data from heterogeneous sources. Here we introduce a Text-guided Universal MR image Synthesis (TUMSyn) generalist model that generates text-specified multimodal brain MRI sequences from any real, acquired sequence. By using demographic data and imaging parameters as text prompts, TUMSyn performs diverse cross-sequence synthesis tasks with a single unified model. To make the text features more effective at steering synthesis, we pre-train a text encoder with a contrastive learning strategy that aligns and fuses image and text semantic information. Developed and evaluated on a multi-center dataset of over 20K brain MR image-text pairs with 7 structural MR contrasts, spanning almost the entire age spectrum and various physical conditions, TUMSyn performs comparably to or better than task-specific methods in both supervised and zero-shot settings, and the synthesized images exhibit accurate anatomical morphology suitable for various clinically relevant downstream tasks. In summary, by incorporating text metadata into image synthesis, TUMSyn achieves an accuracy, versatility, and generalizability that position it as a powerful augmentative tool for conventional MRI systems, offering rapid and cost-effective acquisition of multi-sequence MR images for clinical and research applications.
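The abstract mentions two mechanisms without giving details: text prompts built from demographic data and imaging parameters, and contrastive pre-training that aligns image and text features. The following is a minimal sketch of both ideas, not the authors' implementation. The prompt fields (`age`, `sex`, `sequence`, `field_strength_t`), the function names, the temperature value, and the choice of a symmetric CLIP-style InfoNCE loss are all assumptions made for illustration; the record does not specify any of them.

```python
import numpy as np


def build_prompt(age, sex, sequence, field_strength_t):
    """Format metadata into one text prompt.

    The field names and layout are hypothetical; the abstract only says that
    demographic data and imaging parameters are used as text prompts.
    """
    return (
        f"Age: {age}; Sex: {sex}; Sequence: {sequence}; "
        f"Field strength: {field_strength_t} T"
    )


def _log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    Row i of image_emb and row i of text_emb are assumed to be a matching
    image-text pair; every other row in the batch acts as a negative.
    The temperature of 0.07 is an assumed default, not a value from the paper.
    """
    # L2-normalize so the dot product is a cosine similarity.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # Pairwise similarities: entry (i, j) compares image i with text j.
    logits = image_emb @ text_emb.T / temperature
    idx = np.arange(logits.shape[0])

    # Image-to-text: each row should put its mass on the diagonal entry.
    loss_i2t = -_log_softmax(logits, axis=1)[idx, idx].mean()
    # Text-to-image: each column should put its mass on the diagonal entry.
    loss_t2i = -_log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_i2t + loss_t2i)


if __name__ == "__main__":
    print(build_prompt(45, "F", "T2w", 3.0))

    rng = np.random.default_rng(0)
    image_emb = rng.normal(size=(4, 8))
    # Perfectly aligned text embeddings versus deliberately mismatched ones.
    aligned = symmetric_contrastive_loss(image_emb, image_emb.copy())
    shuffled = symmetric_contrastive_loss(image_emb, image_emb[[1, 2, 3, 0]])
    print(f"aligned loss: {aligned:.4f}, mismatched loss: {shuffled:.4f}")
```

In this sketch the loss is low when each image embedding is closest to its own text embedding and higher when the pairs are mismatched, which is the property a contrastively pre-trained text encoder is meant to provide before its features are used to condition synthesis.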
Pages: 124-133
Page count: 10