Dual conditional GAN based on external attention for semantic image synthesis

Cited by: 2
Authors
Liu, Gang [1 ]
Zhou, Qijun [1 ,2 ]
Xie, Xiaoxiao [1 ]
Yu, Qingchen [1 ]
Affiliations
[1] Hubei Univ Technol, Sch Comp Sci, Wuhan, Peoples R China
[2] Hubei Univ Technol, Sch Comp Sci, Wuhan 430068, Hubei, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Generative adversarial networks; semantic image synthesis; attention; graph convolution;
DOI
10.1080/09540091.2023.2259120
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Although existing semantic image synthesis methods based on generative adversarial networks (GANs) have achieved great success, the quality of the generated images is still unsatisfactory. This is mainly due to two reasons: the information in the semantic layout is sparse, and a single constraint cannot effectively control the positional relationships between objects in the generated image. To address these problems, we propose a dual-conditional GAN based on external attention for semantic image synthesis (DCSIS). In DCSIS, an adaptive normalization method uses the one-hot encoded semantic layout to generate the first latent space, and external attention uses the RGB-encoded semantic layout to generate the second latent space. The two latent spaces control the shapes of objects and the positional relationships between them in the generated image. A graph attention network (GAT) is added to the generator to strengthen the relationships between different categories in the generated image. A graph convolutional segmentation network (GSeg) is designed to learn category-specific information. Experiments on several challenging datasets demonstrate the advantages of our method over existing approaches in terms of both visual quality and representative evaluation metrics.
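The external-attention branch is the distinctive conditioning path in the abstract, so a minimal sketch may help. The following PyTorch module is our own illustration, not the authors' released code: it assumes external attention in the style of Guo et al. (two small learnable memory units M_k and M_v with double normalization, shared across all samples), applied to features extracted from the RGB-encoded layout; the class name ExternalAttention2d, the memory size, and the residual connection are assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ExternalAttention2d(nn.Module):
        # External attention over the spatial positions of a feature map:
        # attention is computed against two learnable memory units (M_k, M_v)
        # shared across all samples, instead of sample-specific keys/values.
        def __init__(self, channels: int, memory_size: int = 64):
            super().__init__()
            self.to_query = nn.Conv2d(channels, channels, kernel_size=1)
            self.mem_k = nn.Linear(channels, memory_size, bias=False)  # M_k
            self.mem_v = nn.Linear(memory_size, channels, bias=False)  # M_v

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            b, c, h, w = x.shape
            q = self.to_query(x).flatten(2).transpose(1, 2)        # (B, HW, C)
            attn = self.mem_k(q)                                   # (B, HW, M)
            attn = F.softmax(attn, dim=1)                          # softmax over positions
            attn = attn / (attn.sum(dim=-1, keepdim=True) + 1e-6)  # double normalization
            out = self.mem_v(attn)                                 # (B, HW, C)
            return x + out.transpose(1, 2).reshape(b, c, h, w)     # residual connection

    # Hypothetical usage: features of the RGB-encoded layout pass through the
    # module to produce the second latent space that conditions the generator.
    # feats = torch.randn(2, 128, 32, 32)
    # z2 = ExternalAttention2d(128)(feats)

Because M_k and M_v are shared across samples, external attention is linear in the number of spatial positions, which suits the dense per-pixel conditioning that semantic layouts require.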
Pages: 15