Catch Missing Details: Image Reconstruction with Frequency Augmented Variational Autoencoder

Cited by: 8
Authors
Lin, Xinmiao [1]
Li, Yikang [2]
Hsiao, Jenhao [2]
Ho, Chiuman [2]
Kong, Yu [3]
Affiliations
[1] Rochester Inst Technol, Rochester, NY USA
[2] OPPO US Res, Shenzhen, Peoples R China
[3] Michigan State Univ, E Lansing, MI USA
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR | 2023
Funding
National Science Foundation (USA)
DOI
10.1109/CVPR52729.2023.00173
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The popular VQ-VAE models reconstruct images by learning a discrete codebook, but their reconstruction quality degrades rapidly as the compression rate rises. One major reason is that a higher compression rate induces more loss of visual signals in the higher frequency spectrum, which reflects the details in pixel space. In this paper, a Frequency Complement Module (FCM) architecture is proposed to capture the missing frequency information and enhance reconstruction quality. The FCM can be easily incorporated into the VQ-VAE structure, and we refer to the new model as Frequency Augmented VAE (FA-VAE). In addition, a Dynamic Spectrum Loss (DSL) is introduced to guide the FCMs to balance dynamically between various frequencies for optimal reconstruction. FA-VAE is further extended to the text-to-image synthesis task, and a Cross-attention Autoregressive Transformer (CAT) is proposed to obtain more precise semantic attributes from texts. Extensive reconstruction experiments with different compression rates are conducted on several benchmark datasets, and the results demonstrate that the proposed FA-VAE restores details more faithfully than SOTA methods. CAT also shows improved generation quality with better image-text semantic alignment.
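
The abstract's motivating claim, that heavier compression mostly costs high-frequency detail, can be illustrated with a toy experiment. The sketch below is not the paper's FCM or DSL: it assumes a one-dimensional "pixel row" in place of an image, and `toy_compress` (block averaging followed by repetition) is a hypothetical stand-in for a lossy encoder and decoder. All function names, the signal, and the cutoff index are illustrative assumptions.

```python
import cmath
import math


def dft(signal):
    """Naive O(N^2) DFT; returns complex coefficients for indices 0..N/2."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n // 2 + 1)
    ]


def toy_compress(signal, factor):
    """Hypothetical stand-in for encode/decode: average each block of
    `factor` samples, then repeat the average to restore the length."""
    out = []
    for start in range(0, len(signal), factor):
        block = signal[start:start + factor]
        out.extend([sum(block) / len(block)] * len(block))
    return out


def band_errors(original, reconstructed, cutoff):
    """Relative spectral error below `cutoff` and at or above it."""
    spec_orig, spec_rec = dft(original), dft(reconstructed)

    def rel(lo, hi):
        err = sum(abs(a - b) ** 2 for a, b in zip(spec_orig[lo:hi], spec_rec[lo:hi]))
        ref = sum(abs(a) ** 2 for a in spec_orig[lo:hi])
        return err / ref

    return rel(0, cutoff), rel(cutoff, len(spec_orig))


# Toy "pixel row": a smooth low-frequency wave plus a finer texture.
N = 64
row = [
    math.sin(2 * math.pi * 2 * t / N) + 0.5 * math.sin(2 * math.pi * 20 * t / N)
    for t in range(N)
]

CUTOFF = 16  # assumed boundary between "low" and "high" frequency indices
errors = {f: band_errors(row, toy_compress(row, f), CUTOFF) for f in (2, 4, 8)}
for factor, (low_err, high_err) in errors.items():
    print(f"factor {factor}: low-band error {low_err:.3f}, high-band error {high_err:.3f}")
```

In this toy setting the high-frequency band carries a larger relative error than the low-frequency band at every factor, which is the kind of gap a frequency-aware module or loss would aim to close. A real evaluation would use a 2D transform on images and a learned encoder rather than block averaging.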
Pages: 1736-1745
Number of pages: 10