High-resolution image reconstruction with latent diffusion models from human brain activity

Cited by: 37
Authors
Takagi, Yu [1, 2]
Nishimoto, Shinji [1, 2]
Affiliations
[1] Osaka Univ, Grad Sch Frontier Biosci, Suita, Osaka, Japan
[2] NICT, CiNet, Osaka, Japan
Keywords
NATURAL IMAGES;
DOI
10.1109/CVPR52729.2023.01389
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Reconstructing visual experiences from human brain activity offers a unique way to understand how the brain represents the world, and to interpret the connection between computer vision models and our visual system. While deep generative models have recently been employed for this task, reconstructing realistic images with high semantic fidelity is still a challenging problem. Here, we propose a new method based on a diffusion model (DM) to reconstruct images from human brain activity obtained via functional magnetic resonance imaging (fMRI). More specifically, we rely on a latent diffusion model (LDM) termed Stable Diffusion. This model reduces the computational cost of DMs while preserving their high generative performance. We also characterize the inner mechanisms of the LDM by studying how its different components (such as the latent vector of image Z, conditioning inputs C, and different elements of the denoising U-Net) relate to distinct brain functions. We show that our proposed method can reconstruct high-resolution images with high fidelity in a straightforward fashion, without the need for any additional training or fine-tuning of complex deep-learning models. We also provide a quantitative interpretation of different LDM components from a neuroscientific perspective. Overall, our study proposes a promising method for reconstructing images from human brain activity, and provides a new framework for understanding DMs. Please check out our webpage at https://sites.google.com/view/stablediffusion-withbrain/.
Pages: 14453-14463
Page count: 11