Text to photo-realistic image synthesis via chained deep recurrent generative adversarial network

Cited by: 2
Authors
Wang, Min [1 ]
Lang, Congyan [1 ]
Feng, Songhe [1 ]
Wang, Tao [1 ]
Jin, Yi [1 ]
Li, Yidong [1 ]
Affiliations
[1] Beijing Jiaotong Univ, Sch Comp & Informat Technol, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
Text-to-image synthesis; Logic relationships; Computational bottlenecks; Parameters sharing;
DOI
10.1016/j.jvcir.2020.102955
CLC classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Despite the promising progress made in recent years, automatically generating high-resolution, realistic images from text descriptions remains a challenging task due to the semantic gap between human-written descriptions and the diversity of visual appearance. Most existing approaches generate rough images from the given text descriptions, without holistically exploiting the relationship between sentence semantics and visual content. In this paper, we propose a novel chained deep recurrent generative adversarial network (CDRGAN) for synthesizing images from text descriptions. Our model uses carefully designed chained deep recurrent generators that simultaneously recover global image structures and local details. Specifically, our method not only considers the logical relationships of image pixels, but also removes computational bottlenecks through parameter sharing. We evaluate our method on three public benchmarks: the CUB, Oxford-102 and MS COCO datasets. Experimental results show that our method consistently and significantly outperforms state-of-the-art approaches across different evaluation metrics.
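The abstract's two key ideas, a chain of generators that progressively refine an image, with one set of refinement weights shared (recurrently reused) across all stages, can be illustrated with a minimal sketch. This is not the paper's actual architecture (those details are not in this record); all dimensions, function names, and the toy convolution below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): a text embedding is mapped to a
# coarse 8x8 base image, then refined by chained stages that double resolution.
EMBED_DIM, BASE = 16, 8

# A single 3x3 filter reused at every stage -- the "parameter sharing" that
# the abstract credits with removing computational bottlenecks.
W_refine = rng.standard_normal((3, 3)) * 0.1

def base_generator(text_embedding):
    """Map a text embedding to a coarse BASE x BASE image (global structure)."""
    W0 = rng.standard_normal((BASE * BASE, EMBED_DIM)) * 0.1
    return np.tanh(W0 @ text_embedding).reshape(BASE, BASE)

def refine(img):
    """One chained stage: 2x upsample, then apply the *shared* filter (local details)."""
    up = img.repeat(2, axis=0).repeat(2, axis=1)  # nearest-neighbour upsample
    padded = np.pad(up, 1)
    out = np.zeros_like(up)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * W_refine)
    return np.tanh(out)

def chained_generator(text_embedding, stages=2):
    """Chain the stages: the same refine() weights are applied recurrently."""
    img = base_generator(text_embedding)
    for _ in range(stages):
        img = refine(img)
    return img

img = chained_generator(rng.standard_normal(EMBED_DIM))
print(img.shape)  # (32, 32): 8 -> 16 -> 32 after two shared refinement stages
```

Because every refinement stage reuses `W_refine`, the parameter count stays constant no matter how many resolutions the chain covers, whereas a stacked design with per-stage weights (as in StackGAN, related paper [1]) grows with the number of stages.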
Pages: 11
Related Papers (50 in total)
  • [1] StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks
    Zhang, Han
    Xu, Tao
    Li, Hongsheng
    Zhang, Shaoting
    Wang, Xiaogang
    Huang, Xiaolei
    Metaxas, Dimitris
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 5908 - 5916
  • [2] Photo-realistic dehazing via contextual generative adversarial networks
    Zhang, Shengdong
    He, Fazhi
    Ren, Wenqi
    [J]. MACHINE VISION AND APPLICATIONS, 2020, 31 (05)
  • [4] Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
    Ledig, Christian
    Theis, Lucas
    Huszar, Ferenc
    Caballero, Jose
    Cunningham, Andrew
    Acosta, Alejandro
    Aitken, Andrew
    Tejani, Alykhan
    Totz, Johannes
    Wang, Zehan
    Shi, Wenzhe
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 105 - 114
  • [5] SemanticGAN: Generative Adversarial Networks for Semantic Image to Photo-Realistic Image Translation
    Liu, Junling
    Zou, Yuexian
    Yang, Dongming
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 2528 - 2532
  • [6] SIMGAN: Photo-Realistic Semantic Image Manipulation Using Generative Adversarial Networks
    Yu, Simiao
    Dong, Hao
    Liang, Felix
    Mo, Yuanhan
    Wu, Chao
    Guo, Yike
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 734 - 738
  • [7] Hierarchically-fused Generative Adversarial Network for text to realistic image synthesis
    Huang, Xin
    Wang, Mingjie
    Gong, Minglun
    [J]. 2019 16TH CONFERENCE ON COMPUTER AND ROBOT VISION (CRV 2019), 2019, : 73 - 80
  • [8] Photo-realistic face age progression/regression using a single generative adversarial network
    Zeng, Jiangfeng
    Ma, Xiao
    Zhou, Ke
    [J]. NEUROCOMPUTING, 2019, 366 : 295 - 304
  • [9] Photo-Realistic Image Dehazing and Verifying Networks via Complementary Adversarial Learning
    Shin, Joongchol
    Paik, Joonki
    [J]. SENSORS, 2021, 21 (18)
  • [10] Photo-Realistic Monocular Gaze Redirection Using Generative Adversarial Networks
    He, Zhe
    Spurr, Adrian
    Zhang, Xucong
    Hilliges, Otmar
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 6931 - 6940