Can Segmentation Models Be Trained with Fully Synthetically Generated Data?

Cited by: 20
Authors
Fernandez, Virginia [1]
Pinaya, Walter Hugo Lopez [1]
Borges, Pedro [1]
Tudosiu, Petru-Daniel [1]
Graham, Mark S. [1]
Vercauteren, Tom [1]
Cardoso, M. Jorge [1]
Affiliation
[1] King's College London, London WC2R 2LS, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
DOI
10.1007/978-3-031-16980-9_8
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In order to achieve good performance and generalisability, medical image segmentation models should be trained on sizeable datasets with sufficient variability. Due to ethics and governance restrictions, and the costs associated with labelling data, scientific development is often stifled, with models trained and tested on limited data. Data augmentation is often used to artificially increase the variability in the data distribution and improve model generalisability. Recent works have explored deep generative models for image synthesis, as such an approach would enable the generation of an effectively infinite amount of varied data, addressing the generalisability and data access problems. However, many proposed solutions limit the user's control over what is generated. In this work, we propose brainSPADE, a model which combines a synthetic diffusion-based label generator with a semantic image generator. Our model can produce fully synthetic brain labels on-demand, with or without pathology of interest, and then generate a corresponding MRI image of an arbitrary guided style. Experiments show that brainSPADE synthetic data can be used to train segmentation models with performance comparable to that of models trained on real data.
Pages: 79-90
Page count: 12