Text-based Sequential Image Generation

Cited by: 0
Authors
Efimova, Valeria [1, 2]
Filchenkov, Andrey [1]
Affiliations
[1] ITMO Univ, St Petersburg, Russia
[2] Statanly Technol, St Petersburg, Russia
Keywords
text-to-image generation; transformer; layout generation
DOI
10.1117/12.2622734
Chinese Library Classification number
O43 [Optics]
Discipline classification codes
070207; 0803
Abstract
Despite the recent impressive results of generative adversarial networks in text-to-image generation, generating complex scenes with multiple objects against a complicated background remains challenging; moreover, end-to-end text-to-image generation still suffers from poor image quality. In this work, we propose a sequential algorithm for text-to-image generation that synthesizes high-quality images (more than 1024x1024 pixels). The proposed approach consists of five stages: location inference, key object extraction, image search, layout generation, and image harmonization. We compare the proposed approach with the state-of-the-art text-to-image generation model DALL-E. Our approach produces effective and visually plausible images based on golden-section layouts.
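
The abstract mentions golden-section layouts but does not say how they are computed. As a rough illustration only, the sketch below finds the four points where the golden-section lines of a canvas intersect and centers an object's bounding box on one of them. The function names `golden_section_points` and `place_object`, and the clamping behavior, are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Tuple

PHI = (1 + 5 ** 0.5) / 2  # golden ratio, approximately 1.618


def golden_section_points(width: int, height: int) -> List[Tuple[float, float]]:
    """Return the four intersections of the golden-section lines of a canvas.

    Each axis is split at 1 - 1/PHI (about 0.382) and 1/PHI (about 0.618)
    of its length. Hypothetical helper, not the authors' method.
    """
    fractions = (1 - 1 / PHI, 1 / PHI)
    xs = [width * f for f in fractions]
    ys = [height * f for f in fractions]
    return [(x, y) for y in ys for x in xs]


def place_object(
    canvas: Tuple[int, int],
    size: Tuple[int, int],
    point: Tuple[float, float],
) -> Tuple[int, int, int, int]:
    """Center a box of the given size on a point, clamped to the canvas.

    Returns (left, top, right, bottom) in pixels. Hypothetical helper.
    """
    canvas_w, canvas_h = canvas
    obj_w, obj_h = size
    # Clamp so the box never extends past the canvas edges.
    left = min(max(point[0] - obj_w / 2, 0), canvas_w - obj_w)
    top = min(max(point[1] - obj_h / 2, 0), canvas_h - obj_h)
    left_px, top_px = round(left), round(top)
    return (left_px, top_px, left_px + obj_w, top_px + obj_h)


if __name__ == "__main__":
    points = golden_section_points(1024, 1024)
    box = place_object((1024, 1024), (200, 100), points[0])
    print(points)
    print(box)
```

A real layout stage would also have to choose which key object goes to which point and resolve overlaps between objects; the sketch covers neither.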
Pages: 8