Catch Missing Details: Image Reconstruction with Frequency Augmented Variational Autoencoder

Cited by: 8
Authors
Lin, Xinmiao [1]
Li, Yikang [2]
Hsiao, Jenhao [2]
Ho, Chiuman [2]
Kong, Yu [3]
Affiliations
[1] Rochester Inst Technol, Rochester, NY USA
[2] OPPO US Res, Shenzhen, Peoples R China
[3] Michigan State Univ, E Lansing, MI USA
Funding
US National Science Foundation
DOI
10.1109/CVPR52729.2023.00173
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The popular VQ-VAE models reconstruct images by learning a discrete codebook, but their reconstruction quality degrades rapidly as the compression rate rises. One major reason is that a higher compression rate induces more loss of visual signals in the higher frequency spectrum, which reflects the details in pixel space. In this paper, a Frequency Complement Module (FCM) architecture is proposed to capture the missing frequency information and enhance reconstruction quality. The FCM can be easily incorporated into the VQ-VAE structure, and we refer to the new model as Frequency Augmented VAE (FA-VAE). In addition, a Dynamic Spectrum Loss (DSL) is introduced to guide the FCMs to balance dynamically between various frequencies for optimal reconstruction. FA-VAE is further extended to the text-to-image synthesis task, and a Cross-attention Autoregressive Transformer (CAT) is proposed to obtain more precise semantic attributes from texts. Extensive reconstruction experiments with different compression rates are conducted on several benchmark datasets, and the results demonstrate that the proposed FA-VAE restores details more faithfully than SOTA methods. CAT also shows improved generation quality with better image-text semantic alignment.
Pages: 1736-1745
Number of pages: 10
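
The abstract's central observation, that stronger compression discards high-frequency content first, can be illustrated with a small sketch. This is not the paper's FCM or DSL: it assumes a toy 1D signal in place of an image and uses block averaging (the hypothetical `toy_compress` helper) as a stand-in for a lossy bottleneck. The names `dft_bin`, `retention`, `LOW_K`, and `HIGH_K` are likewise illustrative only.

```python
import cmath
import math


def dft_bin(signal, k):
    """Magnitude of the k-th DFT coefficient (naive single-bin DFT)."""
    n = len(signal)
    return abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                   for t, x in enumerate(signal)))


def toy_compress(signal, factor):
    """Assumed stand-in for a lossy bottleneck (not VQ-VAE itself):
    average each block of `factor` samples, then repeat the mean so the
    output has the same length as the input."""
    out = []
    for start in range(0, len(signal), factor):
        block = signal[start:start + factor]
        out.extend([sum(block) / len(block)] * len(block))
    return out


N = 64
LOW_K, HIGH_K = 2, 20  # coarse structure vs. fine detail (cycles per signal)

# A smooth component plus a weaker high-frequency "detail" component.
SIGNAL = [math.sin(2 * math.pi * LOW_K * t / N)
          + 0.5 * math.sin(2 * math.pi * HIGH_K * t / N)
          for t in range(N)]


def retention(k, factor):
    """Fraction of the original amplitude at bin k that survives compression."""
    return dft_bin(toy_compress(SIGNAL, factor), k) / dft_bin(SIGNAL, k)


for factor in (1, 2, 4, 8):
    print(f"factor={factor}: "
          f"low={retention(LOW_K, factor):.3f} "
          f"high={retention(HIGH_K, factor):.3f}")
```

In this toy setting the low-frequency component survives largely intact as the factor grows, while the high-frequency component is attenuated sharply. That is the kind of gap a frequency-aware module or loss would aim to close.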