Swin-GAN: generative adversarial network based on shifted windows transformer architecture for image generation

Cited by: 5
Authors
Wang, Shibin [1]
Gao, Zidiao [1]
Liu, Dong [1]
Affiliation
[1] Henan Normal Univ, Sch Comp & Informat Engn, Xinxiang 453007, Henan, Peoples R China
Source
VISUAL COMPUTER | 2023, Vol. 39, Issue 12
Funding
National Natural Science Foundation of China
Keywords
GAN; Transformer; Self-attention; Image generation;
DOI
10.1007/s00371-022-02714-9
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
It is well known that successful generative adversarial networks (GANs) rely on convolutional neural network (CNN)-based generators and discriminators. However, CNNs struggle to capture long-range dependencies because the convolution operator has a local receptive field, which causes problems for GANs such as difficult optimization and the loss of feature resolution and fine details. To address the problem of long-range dependence, we propose a GAN model based on the shifted windows Transformer architecture, called Swin-GAN, in which the CNN architecture is replaced by a Transformer. In our model, we build a memory-friendly generator based on the shifted window attention mechanism that gradually increases the resolution of the feature maps at each stage. In addition, we build a multi-scale discriminator that splits the image into patches of different sizes as the input at different stages, which balances capturing global contextual semantic information against capturing local detailed features. To further improve fidelity and stability, we use techniques such as data augmentation, layer normalization and relative position encoding in our model. Compared with current schemes, the experimental results show that our scheme has better performance, fewer parameters and lower computational cost. Specifically, Swin-GAN has 30.254M parameters and requires 4.086G floating-point operations (FLOPs). On CIFAR-10, it achieves an Inception Score (IS) of 9.04 and a Fréchet Inception Distance (FID) of 9.23.
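For readers unfamiliar with the shifted window mechanism the abstract refers to, the following is a minimal NumPy sketch of the general idea as it is known from the Swin Transformer: a feature map is split into non-overlapping windows, and in alternating layers the map is cyclically shifted first so that the windows straddle the previous window borders. This is an illustration only, not the authors' implementation; the function names, the 8x8 single-channel map, the window size of 4 and the shift of 2 are all assumptions chosen for the example.

```python
import numpy as np


def window_partition(x, window_size):
    """Split an (H, W, C) feature map into non-overlapping square windows.

    Returns an array of shape (num_windows, window_size, window_size, C).
    Hypothetical helper for illustration; not taken from the paper.
    """
    h, w, c = x.shape
    assert h % window_size == 0 and w % window_size == 0
    x = x.reshape(h // window_size, window_size, w // window_size, window_size, c)
    # Put the two window-grid axes next to each other, then flatten them.
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, window_size, window_size, c)


def shifted_window_partition(x, window_size, shift):
    """Cyclically shift the map up and left by `shift`, then partition it."""
    shifted = np.roll(x, shift=(-shift, -shift), axis=(0, 1))
    return window_partition(shifted, window_size)


# Toy 8x8 map with one channel; values are the flattened pixel indices.
feature_map = np.arange(8 * 8, dtype=np.float32).reshape(8, 8, 1)
regular = window_partition(feature_map, window_size=4)
shifted = shifted_window_partition(feature_map, window_size=4, shift=2)
print(regular.shape, shifted.shape)
```

Self-attention would then be computed independently inside each window, which keeps the cost proportional to the number of windows rather than quadratic in the full image size; the sketch covers only the partitioning step and omits the attention computation and the masking of wrapped-around regions.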
Pages: 6085-6095
Number of pages: 11