Semantics-enhanced Adversarial Nets for Text-to-Image Synthesis

Cited by: 50
Authors
Tan, Hongchen [1]
Liu, Xiuping [1]
Li, Xin [2]
Zhang, Yi [1]
Yin, Baocai [1,3]
Affiliations
[1] Dalian Univ Technol, Dalian, Peoples R China
[2] Louisiana State Univ, Baton Rouge, LA 70803 USA
[3] Peng Cheng Lab, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1109/ICCV.2019.01060
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a new model, the Semantics-enhanced Generative Adversarial Network (SEGAN), for fine-grained text-to-image generation. SEGAN introduces two modules: a Semantic Consistency Module (SCM) and an Attention Competition Module (ACM). The SCM incorporates image-level semantic consistency into the training of the Generative Adversarial Network (GAN), and can diversify the generated images and improve their structural coherence. A Siamese network and two types of semantic similarities are designed to map the synthesized image and the ground-truth image to nearby points in the latent semantic feature space. The ACM constructs adaptive attention weights to differentiate keywords from unimportant words, and improves the stability and accuracy of SEGAN. Extensive experiments demonstrate that SEGAN significantly outperforms existing state-of-the-art methods in generating photo-realistic images. All source code and models will be released for comparative study.
Pages: 10500-10509
Number of pages: 10