Text-based Sequential Image Generation

Authors
Efimova, Valeria [1, 2]
Filchenkov, Andrey [1 ]
Affiliations
[1] ITMO Univ, St Petersburg, Russia
[2] Statanly Technol, St Petersburg, Russia
Keywords
text-to-image generation; transformer; layout generation
DOI
10.1117/12.2622734
Chinese Library Classification
O43 [Optics]
Subject classification codes
070207; 0803
Abstract
Despite recent impressive results of generative adversarial networks on text-to-image generation, generating complex scenes with multiple objects against a complicated background remains challenging; moreover, end-to-end text-to-image generation still suffers from poor image quality. In this work, we propose a sequential algorithm for text-to-image generation that synthesizes high-resolution images (more than 1024x1024 pixels). The proposed approach consists of five stages: location inference, key object extraction, image search, layout generation, and image harmonization. We compare the proposed approach with DALL-E, a state-of-the-art text-to-image generation model. The comparison demonstrates the effectiveness of our approach and the visual plausibility of images generated from golden section layouts.
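The abstract mentions golden section layouts without describing how they are computed. As an illustration only, the sketch below shows one common reading of the idea: divide the canvas by the golden ratio along each axis and centre objects on the four resulting intersection points. The function names, the choice of intersection points as anchors, and the clamping rule are assumptions made for this example, not the authors' layout generation stage.

```python
from typing import List, Tuple

PHI = (1 + 5 ** 0.5) / 2  # golden ratio, about 1.618

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def golden_section_points(width: int, height: int) -> List[Tuple[float, float]]:
    """Return the four intersections of the golden-section lines of a canvas.

    Each axis is split at about 38.2% and 61.8% of its length.
    """
    xs = (width / PHI ** 2, width / PHI)
    ys = (height / PHI ** 2, height / PHI)
    return [(x, y) for y in ys for x in xs]


def place_objects(width: int, height: int,
                  sizes: List[Tuple[int, int]]) -> List[Box]:
    """Centre each (w, h) object on a golden-section point (hypothetical rule).

    Assumes every object fits inside the canvas and that there are at most
    four objects; boxes are clamped so they stay within the canvas bounds.
    """
    boxes = []
    for (w, h), (cx, cy) in zip(sizes, golden_section_points(width, height)):
        left = min(max(round(cx - w / 2), 0), width - w)
        top = min(max(round(cy - h / 2), 0), height - h)
        boxes.append((left, top, left + w, top + h))
    return boxes
```

For example, place_objects(1024, 1024, [(200, 200), (300, 150)]) returns one bounding box per object, each centred on a different golden-section point of a 1024x1024 canvas.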
Pages: 8