BI-LSTM Based Encoding and GAN for Text-to-Image Synthesis

Cited by: 1
Authors
Talasila, Vamsidhar [1]
Narasingarao, M. R. [1, 2]
Affiliations
[1] Koneru Lakshmaiah Educ Fdn, Dept Comp Sci & Engn, Vaddeswaram, Andhra Pradesh, India
[2] GITAM Univ, Dept Comp Sci & Engn, Visakhapatnam, Andhra Pradesh, India
Source
SENSING AND IMAGING | 2022 / Vol. 23 / Issue 01
Keywords
Text-to-image; Image encoder; Cross modal; BI-LSTM; GAN model;
DOI
10.1007/s11220-022-00390-6
Chinese Library Classification number
TH7 [Instruments and meters];
Discipline classification code
0804 ; 080401 ; 081102 ;
Abstract
Text-to-image synthesis aims to produce images whose content faithfully matches a given text description. It is an extremely demanding task whose main challenges are content consistency and visual realism. Owing to considerable progress in generative adversarial networks (GANs), it is now possible to produce images with good visual realism. Translating text descriptions into images with high content fidelity, on the other hand, remains a work in progress. This paper proposes a novel text-to-image synthesis approach with two major phases: (1) text-to-image encoding and (2) a GAN. In the first phase, cross-modal feature alignment is performed between text and image features, and a BI-LSTM is then deployed to convert the text embedding into a feature vector. In the second phase, the image is synthesized from this encoding: the text feature group is given as input to the GAN, which produces the final synthesized images. Finally, the superiority of the developed approach is examined through a comparative evaluation against existing techniques.
Pages: 17