UVCGAN: UNet Vision Transformer cycle-consistent GAN for unpaired image-to-image translation

Cited by: 76
Authors
Torbunov, Dmitrii [1]
Huang, Yi [1]
Yu, Haiwang [1]
Huang, Jin [1]
Yoo, Shinjae [1]
Lin, Meifeng [1]
Viren, Brett [1]
Ren, Yihui [1]
Affiliation
[1] Brookhaven Natl Lab, Upton, NY 11973 USA
DOI
10.1109/WACV56688.2023.00077
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Unpaired image-to-image translation has broad applications in art, design, and scientific simulations. One early breakthrough was CycleGAN, which emphasizes one-to-one mappings between two unpaired image domains via generative adversarial networks (GANs) coupled with a cycle-consistency constraint, while more recent works promote one-to-many mappings to boost the diversity of the translated images. Motivated by scientific simulation and its need for one-to-one mappings, this work revisits the classic CycleGAN framework and boosts its performance to outperform more contemporary models without relaxing the cycle-consistency constraint. To achieve this, we equip the generator with a Vision Transformer (ViT) and employ the necessary training and regularization techniques. Compared to previous best-performing models, our model performs better and retains a strong correlation between the original and translated images. An accompanying ablation study shows that both the gradient penalty and self-supervised pre-training are crucial to the improvement. To promote reproducibility and open science, the source code, hyperparameter configurations, and pre-trained model are available at https://github.com/LS4GAN/uvcgan.
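The cycle-consistency constraint named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that); it assumes the standard CycleGAN L1 formulation, in which an image translated to the other domain and back should match the original. The names `gen_ab`, `gen_ba`, `l1_distance`, and `cycle_consistency_loss` are hypothetical, and images are represented as flat lists of floats instead of tensors.

```python
from typing import Callable, Sequence

# Assumption: a flattened image stands in for a real tensor type.
Image = Sequence[float]


def l1_distance(x: Image, y: Image) -> float:
    """Mean absolute difference between two equally sized flattened images."""
    if len(x) != len(y):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def cycle_consistency_loss(
    gen_ab: Callable[[Image], Image],  # hypothetical generator: domain A -> B
    gen_ba: Callable[[Image], Image],  # hypothetical generator: domain B -> A
    batch_a: Sequence[Image],
    batch_b: Sequence[Image],
) -> float:
    """Sum of the A -> B -> A and B -> A -> B reconstruction errors."""
    # Translate each A image to B and back, then compare with the original.
    loss_a = sum(l1_distance(gen_ba(gen_ab(a)), a) for a in batch_a) / len(batch_a)
    # Same round trip starting from domain B.
    loss_b = sum(l1_distance(gen_ab(gen_ba(b)), b) for b in batch_b) / len(batch_b)
    return loss_a + loss_b


# Toy generators that are exact inverses, so the round trip is lossless.
double = lambda img: [2.0 * v for v in img]
halve = lambda img: [v / 2.0 for v in img]
loss = cycle_consistency_loss(double, halve, [[1.0, 2.0]], [[4.0, 8.0]])
```

In this toy case the two generators invert each other exactly, so the loss is zero. A pair of generators that does not round-trip cleanly yields a positive value, which is the quantity that training under this constraint pushes down.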
Pages: 702-712
Page count: 11