Catch Missing Details: Image Reconstruction with Frequency Augmented Variational Autoencoder

Cited by: 8

Authors:

Lin, Xinmiao [1]

Li, Yikang [2]

Hsiao, Jenhao [2]

Ho, Chiuman [2]

Kong, Yu [3]

Affiliations:

[1] Rochester Inst Technol, Rochester, NY USA

[2] OPPO US Res, Shenzhen, Peoples R China

[3] Michigan State Univ, E Lansing, MI USA

Source:

2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR | 2023

Funding:

U.S. National Science Foundation;

DOI:

10.1109/CVPR52729.2023.00173

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The popular VQ-VAE models reconstruct images through learning a discrete codebook but suffer from a significant issue in the rapid quality degradation of image reconstruction as the compression rate rises. One major reason is that a higher compression rate induces more loss of visual signals on the higher frequency spectrum which reflect the details on pixel space. In this paper, a Frequency Complement Module (FCM) architecture is proposed to capture the missing frequency information for enhancing reconstruction quality. The FCM can be easily incorporated into the VQ-VAE structure, and we refer to the new model as Frequency Augmented VAE (FA-VAE). In addition, a Dynamic Spectrum Loss (DSL) is introduced to guide the FCMs to balance between various frequencies dynamically for optimal reconstruction. FA-VAE is further extended to the text-to-image synthesis task, and a Cross-attention Autoregressive Transformer (CAT) is proposed to obtain more precise semantic attributes in texts. Extensive reconstruction experiments with different compression rates are conducted on several benchmark datasets, and the results demonstrate that the proposed FA-VAE is able to restore more faithfully the details compared to SOTA methods. CAT also shows improved generation quality with better image-text semantic alignment.
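
The abstract's claim that higher compression loses more high-frequency signal can be illustrated with a small NumPy sketch. This is not the paper's FCM, DSL, or any VQ-VAE: block average pooling followed by nearest-neighbour upsampling is an assumed stand-in for a lossy encoder/decoder, and the function names, the 0.25 cycles/pixel cutoff, and the white-noise test image are all hypothetical choices made for illustration.

```python
import numpy as np


def radial_frequency(shape):
    """Radial spatial frequency (cycles/pixel) of every 2D FFT bin."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    return np.sqrt(fx ** 2 + fy ** 2)


def band_energy(img, cutoff=0.25):
    """Spectral energy of img above the (assumed) radial cutoff."""
    power = np.abs(np.fft.fft2(img)) ** 2
    return power[radial_frequency(img.shape) > cutoff].sum()


def pool_and_upsample(img, factor):
    """Stand-in for compression: average factor x factor blocks, then
    repeat each block value back to the original resolution."""
    h, w = img.shape
    if h % factor or w % factor:
        raise ValueError("image size must be divisible by factor")
    blocks = img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    return np.repeat(np.repeat(blocks, factor, axis=0), factor, axis=1)


# Synthetic white-noise image: its energy is spread over all frequencies.
rng = np.random.default_rng(0)
image = rng.standard_normal((128, 128))

reference = band_energy(image)
retained = {
    factor: band_energy(pool_and_upsample(image, factor)) / reference
    for factor in (2, 4, 8)
}

for factor, ratio in retained.items():
    print(f"{factor}x pooling keeps {ratio:.1%} of the high-frequency energy")
```

In this toy setting the retained share of high-frequency energy shrinks as the pooling factor grows, which is the degradation pattern the abstract attributes to rising compression rates.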

Pages: 1736-1745

Page count: 10
