TUMSyn: A Text-Guided Generalist Model for Customized Multimodal MR Image Synthesis

Times cited: 0
Authors
Wang, Yulin [1, 2, 3, 4]
Xiong, Honglin [3, 4]
Xie, Yi [3, 4]
Liu, Jiameng [3, 4]
Wang, Qian [3, 4, 5]
Liu, Qian [1, 2, 6]
Shen, Dinggang [3, 4, 5]
Affiliations
[1] Hainan Univ, Sch Biomed Engn, Haikou 570228, Hainan, Peoples R China
[2] Hainan Univ, State Key Lab Digital Med Engn, Haikou 570228, Hainan, Peoples R China
[3] ShanghaiTech Univ, Sch Biomed Engn, Shanghai 201210, Peoples R China
[4] ShanghaiTech Univ, State Key Lab Adv Med Mat & Devices, Shanghai 201210, Peoples R China
[5] Shanghai Clin Res & Trial Ctr, Shanghai 201210, Peoples R China
[6] Hainan Univ, Hlth Inst 1, Key Lab Biomed Engn Hainan Prov, Haikou 570228, Hainan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Foundation Model; Multimodal MRI; MRI Synthesis; Super-resolution
DOI
10.1007/978-3-031-73471-7_13
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multimodal magnetic resonance (MR) imaging has revolutionized our understanding of the human brain. However, various limitations in clinical scanning hinder data acquisition. Current medical image synthesis techniques, often designed for specific tasks or modalities, perform worse when confronted with MRI data from heterogeneous sources. Here we introduce a Text-guided Universal MR image Synthesis (TUMSyn) generalist model that generates text-specified multimodal brain MRI sequences from any acquired sequence. By using demographic data and imaging parameters as text prompts, TUMSyn performs diverse cross-sequence synthesis tasks with a single unified model. To make text features more effective in steering synthesis, we pre-train a text encoder with a contrastive learning strategy that aligns and fuses image and text semantic information. Developed and evaluated on a multi-center dataset of over 20K brain MR image-text pairs with 7 structural MR contrasts, spanning almost the entire age spectrum and various physical conditions, TUMSyn achieves performance comparable to or exceeding that of task-specific methods in both supervised and zero-shot settings, and the synthesized images exhibit accurate anatomical morphology suitable for various downstream clinical tasks. In summary, by incorporating text metadata into image synthesis, TUMSyn offers accuracy, versatility, and generalizability that position it as a powerful augmentative tool for conventional MRI systems, enabling rapid and cost-effective acquisition of multi-sequence MR images for clinical and research applications.
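The abstract states only that the text encoder is pre-trained with a contrastive learning strategy to align image and text information; it does not give the loss. As an illustration, the sketch below shows a generic symmetric contrastive (InfoNCE-style) loss of the kind commonly used for image-text alignment. The function names, the temperature value, and the toy embeddings are assumptions made for this example and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric image-text contrastive loss (hypothetical helper).

    Pair i of image_embs and text_embs is treated as the matching pair;
    every other combination in the batch acts as a negative.
    The temperature of 0.07 is an assumed, commonly used default.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    # logits[i][j] = cosine similarity of image i and text j, scaled by temperature
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))


# Toy embeddings: three orthogonal "image" vectors and their "text" counterparts.
image_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
aligned = contrastive_loss(image_embs, image_embs)
shuffled = contrastive_loss(image_embs, image_embs[1:] + image_embs[:1])
```

In this toy case the loss is lower when each text embedding lines up with its own image embedding than when the pairs are shuffled, which is the property a contrastive pre-training objective of this kind is meant to reward.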
Pages: 124-133
Page count: 10