Image-to-image translation using an offset-based multi-scale codes GAN encoder

Cited by: 8
Authors
Guo, Zihao [1 ]
Shao, Mingwen [1 ]
Li, Shunhang [1 ]
Affiliations
[1] China Univ Petr, Coll Comp Sci & Technol, Qingdao, Peoples R China
Source
VISUAL COMPUTER | 2024, Vol. 40, No. 2
Funding
National Natural Science Foundation of China
Keywords
Generative adversarial networks; GAN inversion; Image-to-image translation; Super-resolution; Conditional face synthesis;
DOI
10.1007/s00371-023-02810-4
CLC number
TP31 [Computer Software]
Discipline codes
081202; 0835
Abstract
Despite the remarkable achievements of generative adversarial networks (GANs) in high-quality image synthesis, applying pre-trained GAN models to image-to-image translation remains challenging. Previous approaches typically map the conditional image into the latent space of a GAN either by per-image optimization or by learning a GAN encoder; however, neither method performs image-to-image translation tasks ideally. In this work, we propose a novel learning-based framework that completes common image-to-image translation tasks with high quality in real time on top of pre-trained GANs. Specifically, to mitigate the semantic misalignment between conditional and synthesized images, we propose an offset-based image synthesis method that allows our encoder to predict the latent codes over multiple forward passes rather than a single one. During these passes, the latent codes are adjusted continuously according to the semantic difference between the conditional image and the current synthesized image. To further reduce the loss of detail during encoding, we extract multiple latent codes at multiple scales from the input, instead of a single code, to synthesize the image. Moreover, we propose an optional multi-feature-map fusion module that combines our encoder with different generators to implement our multiple-latent-code strategy. Finally, we analyze the performance and demonstrate the effectiveness of our framework by comparing it with state-of-the-art works on super-resolution and conditional face synthesis tasks.
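To make the offset-based, multi-pass encoding loop described in the abstract concrete, here is a minimal PyTorch sketch. It assumes a StyleGAN-like setup in which a frozen generator maps a stack of W+ latent codes to an image; the module names (OffsetEncoder, ToyGenerator), network shapes, and the number of refinement passes are illustrative placeholders, not the authors' implementation.

```python
# Illustrative sketch only: module names, shapes, and the pass count are
# assumptions, not the authors' released code.
import torch
import torch.nn as nn


class OffsetEncoder(nn.Module):
    """Toy encoder: maps a (conditional image, current synthesis) pair to
    offsets for a stack of latent codes."""

    def __init__(self, n_codes=14, code_dim=512):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(6, 64, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(128, n_codes * code_dim)
        self.n_codes, self.code_dim = n_codes, code_dim

    def forward(self, cond_img, cur_img):
        # Stack the pair along the channel axis so the network can compare them.
        x = torch.cat([cond_img, cur_img], dim=1)
        offsets = self.head(self.backbone(x))
        return offsets.view(-1, self.n_codes, self.code_dim)


class ToyGenerator(nn.Module):
    """Stand-in for a frozen pre-trained generator (W+ codes -> RGB image)."""

    def __init__(self, n_codes=14, code_dim=512, size=32):
        super().__init__()
        self.fc = nn.Linear(n_codes * code_dim, 3 * size * size)
        self.size = size

    def forward(self, w):
        return torch.tanh(self.fc(w.flatten(1))).view(-1, 3, self.size, self.size)


@torch.no_grad()
def translate(cond_img, encoder, generator, w_init, n_passes=3):
    """Refine the latent codes over several forward passes: each pass predicts
    an offset from the semantic gap between the conditional image and the
    current synthesis, then re-synthesizes with the adjusted codes."""
    w = w_init.clone()
    img = generator(w)
    for _ in range(n_passes):
        w = w + encoder(cond_img, img)  # offset-based update of the codes
        img = generator(w)              # re-synthesize with adjusted codes
    return img, w


# Usage with random stand-in data:
enc, gen = OffsetEncoder(), ToyGenerator()
cond = torch.randn(1, 3, 32, 32)   # stand-in conditional image
w0 = torch.zeros(1, 14, 512)       # e.g. the generator's mean latent code
out, w = translate(cond, enc, gen, w0)
print(out.shape, w.shape)          # [1, 3, 32, 32] and [1, 14, 512]
```

In a real pipeline, ToyGenerator would be replaced by the frozen pre-trained generator, the encoder backbone would be a deep feature-pyramid network producing codes at multiple scales, and w0 would typically be initialized to the generator's mean latent code.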
Pages: 699-715
Number of pages: 17
Related Papers
50 records in total
  • [1] Image-to-image translation using an offset-based multi-scale codes GAN encoder
    Zihao Guo
    Mingwen Shao
    Shunhang Li
    The Visual Computer, 2024, 40 (2) : 699 - 715
  • [2] Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation
    Richardson, Elad
    Alaluf, Yuval
    Patashnik, Or
    Nitzan, Yotam
    Azar, Yaniv
    Shapiro, Stav
    Cohen-Or, Daniel
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2021), 2021: 2287 - 2296
  • [3] Photogenic Guided Image-to-Image Translation With Single Encoder
    Oh, Rina
    Gonsalves, T.
    IEEE OPEN JOURNAL OF THE COMPUTER SOCIETY, 2024, 5 : 624 - 635
  • [4] Consistent Embedded GAN for Image-to-Image Translation
    Xiong, Feng
    Wang, Qianqian
    Gao, Quanxue
    IEEE ACCESS, 2019, 7 : 126651 - 126661
  • [5] Asymmetric GAN for Unpaired Image-to-Image Translation
    Li, Yu
    Tang, Sheng
    Zhang, Rui
    Zhang, Yongdong
    Li, Jintao
    Yan, Shuicheng
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (12) : 5881 - 5896
  • [6] SUNIT: multimodal unsupervised image-to-image translation with shared encoder
    Lin, Liyuan
    Ji, Shulin
    Zhou, Yuan
    Zhang, Shun
    JOURNAL OF ELECTRONIC IMAGING, 2022, 31 (01)
  • [7] Weakly Supervised GAN for Image-to-Image Translation in the Wild
    Cao, Zhiyi
    Niu, Shaozhang
    Zhang, Jiwei
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2020, 2020
  • [8] SPA-GAN: Spatial Attention GAN for Image-to-Image Translation
    Emami, Hajar
    Aliabadi, Majid Moradi
    Dong, Ming
    Chinnam, Ratna Babu
    IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 391 - 401
  • [9] Image-To-Image Translation Using a Cross-Domain Auto-Encoder and Decoder
    Yoo, Jaechang
    Eom, Heesong
    Choi, Yong Suk
APPLIED SCIENCES-BASEL, 2019, 9 (22)
  • [10] Multimodal Unsupervised Image-to-Image Translation Without Independent Style Encoder
    Sun, Yanbei
    Lu, Yao
    Lu, Haowei
    Zhao, Qingjie
    Wang, Shunzhou
    MULTIMEDIA MODELING (MMM 2022), PT I, 2022, 13141 : 624 - 636