Hierarchical Image Generation via Transformer-Based Sequential Patch Selection

被引:0
|
作者
Xu, Xiaogang [1 ]
Xu, Ning [2 ]
机构
[1] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Peoples R China
[2] Adobe Res, San Jose, CA USA
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
To synthesize images with preferred objects and interactions, a controllable way is to generate the image from a scene graph and a large pool of object crops, where the spatial arrangements of the objects in the image are defined by the scene graph while their appearances are determined by the retrieved crops from the pool. In this paper, we propose a novel framework with such a semi-parametric generation strategy. First, to encourage the retrieval of mutually compatible crops, we design a sequential selection strategy where the crop selection for each object is determined by the contents and locations of all object crops that have been chosen previously. Such process is implemented via a transformer trained with contrastive losses. Second, to generate the final image, our hierarchical generation strategy leverages hierarchical gated convolutions which are employed to synthesize areas not covered by any image crops, and a patch-guided spatially adaptive normalization module which is proposed to guarantee the final generated images complying with the crop appearance and the scene graph. Evaluated on the challenging Visual Genome and COCO-Stuff dataset, our experimental results demonstrate the superiority of our proposed method over existing state-of-the-art methods.
引用
收藏
页码:2938 / 2945
页数:8
相关论文
共 50 条
  • [21] OPTICAL SATELLITE IMAGE CHANGE DETECTION VIA TRANSFORMER-BASED SIAMESE NETWORK
    Wu, Yang
    Wang, Yuyao
    Li, Yanheng
    Xu, Qizhi
    2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022), 2022, : 1436 - 1439
  • [22] Exploring Visual Relationships via Transformer-based Graphs for Enhanced Image Captioning
    Li, Jingyu
    Mao, Zhendong
    Li, Hao
    Chen, Weidong
    Zhang, Yongdong
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (05)
  • [23] Reranking for Efficient Transformer-based Answer Selection
    Matsubara, Yoshitomo
    Vu, Thuy
    Moschitti, Alessandro
    PROCEEDINGS OF THE 43RD INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '20), 2020, : 1577 - 1580
  • [24] Transformer-Based Rating-Aware Sequential Recommendation
    Li, Yang
    Li, Qianmu
    Meng, Shunmei
    Hou, Jun
    ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2021, PT I, 2022, 13155 : 759 - 774
  • [25] TRANSFORMER-BASED HIERARCHICAL CLUSTERING FOR BRAIN NETWORK ANALYSIS
    Dai, Wei
    Cui, Hejie
    Kan, Xuan
    Guo, Ying
    Van Rooij, Sanne
    Yang, Carl
    2023 IEEE 20TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING, ISBI, 2023,
  • [26] A Sparse Transformer-Based Approach for Image Captioning
    Lei, Zhou
    Zhou, Congcong
    Chen, Shengbo
    Huang, Yiyong
    Liu, Xianrui
    IEEE ACCESS, 2020, 8 : 213437 - 213446
  • [27] A Sparse Transformer-Based Approach for Image Captioning
    Lei, Zhou
    Zhou, Congcong
    Chen, Shengbo
    Huang, Yiyong
    Liu, Xianrui
    IEEE Access, 2020, 8 : 213437 - 213446
  • [28] Transformer-based Extraction of Deep Image Models
    Battis, Verena
    Penner, Alexander
    2022 IEEE 7TH EUROPEAN SYMPOSIUM ON SECURITY AND PRIVACY (EUROS&P 2022), 2022, : 320 - 336
  • [29] ThaiTC:Thai Transformer-based Image Captioning
    Jaknamon, Teetouch
    Marukatat, Sanparith
    2022 17TH INTERNATIONAL JOINT SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND NATURAL LANGUAGE PROCESSING (ISAI-NLP 2022) / 3RD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND INTERNET OF THINGS (AIOT 2022), 2022,
  • [30] A Transformer-Based Variational Autoencoder for Sentence Generation
    Liu, Danyang
    Liu, Gongshen
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,