Catch Missing Details: Image Reconstruction with Frequency Augmented Variational Autoencoder

Cited by: 8

Authors:

Lin, Xinmiao [1]

Li, Yikang [2]

Hsiao, Jenhao [2]

Ho, Chiuman [2]

Kong, Yu [3]

Affiliations:

[1] Rochester Inst Technol, Rochester, NY USA

[2] OPPO US Res, Shenzhen, Peoples R China

[3] Michigan State Univ, E Lansing, MI USA

Source:

2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR | 2023

Funding:

National Science Foundation (USA);

Keywords:

DOI:

10.1109/CVPR52729.2023.00173

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The popular VQ-VAE models reconstruct images through learning a discrete codebook but suffer from a significant issue in the rapid quality degradation of image reconstruction as the compression rate rises. One major reason is that a higher compression rate induces more loss of visual signals on the higher frequency spectrum which reflect the details on pixel space. In this paper, a Frequency Complement Module (FCM) architecture is proposed to capture the missing frequency information for enhancing reconstruction quality. The FCM can be easily incorporated into the VQ-VAE structure, and we refer to the new model as Frequency Augmented VAE (FA-VAE). In addition, a Dynamic Spectrum Loss (DSL) is introduced to guide the FCMs to balance between various frequencies dynamically for optimal reconstruction. FA-VAE is further extended to the text-to-image synthesis task, and a Cross-attention Autoregressive Transformer (CAT) is proposed to obtain more precise semantic attributes in texts. Extensive reconstruction experiments with different compression rates are conducted on several benchmark datasets, and the results demonstrate that the proposed FA-VAE is able to restore more faithfully the details compared to SOTA methods. CAT also shows improved generation quality with better image-text semantic alignment.


Pages: 1736-1745

Page count: 10

