Hierarchical Image Generation via Transformer-Based Sequential Patch Selection

Cited by: 0
Authors
Xu, Xiaogang [1]
Xu, Ning [2]
Affiliations
[1] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Peoples R China
[2] Adobe Res, San Jose, CA USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
To synthesize images with preferred objects and interactions, a controllable approach is to generate the image from a scene graph and a large pool of object crops, where the spatial arrangement of the objects in the image is defined by the scene graph while their appearance is determined by crops retrieved from the pool. In this paper, we propose a novel framework with such a semi-parametric generation strategy. First, to encourage the retrieval of mutually compatible crops, we design a sequential selection strategy in which the crop selected for each object is determined by the contents and locations of all previously chosen object crops. This process is implemented with a transformer trained with contrastive losses. Second, to generate the final image, our hierarchical generation strategy uses hierarchical gated convolutions to synthesize areas not covered by any image crop, and a patch-guided spatially adaptive normalization module to ensure that the final generated images comply with the crop appearance and the scene graph. Experiments on the challenging Visual Genome and COCO-Stuff datasets demonstrate the superiority of the proposed method over existing state-of-the-art methods.
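The sequential selection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract describes a transformer trained with contrastive losses that conditions on both the contents and the locations of earlier crops, whereas the sketch below substitutes a simple stand-in (cosine similarity against the mean embedding of previously chosen crops) and ignores locations. The names `select_crops`, `cosine`, and `mean_vector`, and the pool layout, are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def select_crops(objects, pool):
    """Greedy sequential crop selection (illustrative stand-in only).

    objects: ordered list of object labels taken from the scene graph.
    pool:    dict mapping label -> list of (crop_id, embedding) candidates.

    Each object's crop is chosen conditioned on the crops already chosen.
    ASSUMPTION: the paper's transformer scorer is replaced here by cosine
    similarity to the mean embedding of earlier selections.
    """
    chosen = []  # (label, crop_id, embedding) in selection order
    for label in objects:
        candidates = pool[label]
        if not chosen:
            # No context yet: fall back to the first candidate.
            crop_id, emb = candidates[0]
        else:
            context = mean_vector([e for _, _, e in chosen])
            crop_id, emb = max(candidates, key=lambda c: cosine(c[1], context))
        chosen.append((label, crop_id, emb))
    return [(label, crop_id) for label, crop_id, _ in chosen]
```

Calling `select_crops(["sky", "tree"], pool)` with a pool that holds two candidate tree crops would return the tree crop whose embedding is closest to that of the sky crop chosen first, which is the sense in which later selections depend on earlier ones.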
Pages: 2938-2945
Number of pages: 8
Related papers
50 in total
  • [41] Optimizing transformer-based network via advanced decoder design for medical image segmentation
    Yang, Weibin
    Dong, Zhiqi
    Xu, Mingyuan
    Xu, Longwei
    Geng, Dehua
    Li, Yusong
    Wang, Pengwei
    BIOMEDICAL PHYSICS & ENGINEERING EXPRESS, 2025, 11 (02)
  • [42] Disentangled Multimodal Brain MR Image Translation via Transformer-based Modality Infuser
    Cho, Jihoon
    Liu, Xiaofeng
    Xing, Fangxu
    Ouyang, Jinsong
    El Fakhri, Georges
    Park, Jinah
    Woo, Jonghye
    MEDICAL IMAGING 2024: IMAGE PROCESSING, 2024, 12926
  • [43] Transformer-Based Image Inpainting Detection via Label Decoupling and Constrained Adversarial Training
    Li, Yuanman
    Hu, Liangpei
    Dong, Li
    Wu, Haiwei
    Tian, Jinyu
    Zhou, Jiantao
    Li, Xia
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (03) : 1857 - 1872
  • [44] Transformer-Based No-Reference Image Quality Assessment via Supervised Contrastive Learning
    Shi, Jinsong
    Gao, Pan
    Qin, Jie
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 5, 2024 : 4829 - 4837
  • [45] Improving Transformer-based Sequential Recommenders through Preference Editing
    Ma, Muyang
    Ren, Pengjie
    Chen, Zhumin
    Ren, Zhaochun
    Liang, Huasheng
    Ma, Jun
    De Rijke, Maarten
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2023, 41 (03)
  • [46] Personalization Through User Attributes for Transformer-Based Sequential Recommendation
    Fischer, Elisabeth
    Dallmann, Alexander
    Hotho, Andreas
    RECOMMENDER SYSTEMS IN FASHION AND RETAIL, 2023, 981 : 25 - 43
  • [47] A Transformer-based Multi-Platform Sequential Estimation Fusion
    Zhai, Xupeng
    Yang, Yanbo
    Liu, Zhunga
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2025, 144
  • [48] VulExplainer: A Transformer-Based Hierarchical Distillation for Explaining Vulnerability Types
    Fu, Michael
    Nguyen, Van
    Tantithamthavorn, Chakkrit
    Le, Trung
    Phung, Dinh
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2023, 49 (10) : 4550 - 4565
  • [49] Towards Hierarchical Regional Transformer-based Multiple Instance Learning
    Cersovsky, Josef
    Mohammadi, Sadegh
    Kainmueller, Dagmar
    Hoehne, Johannes
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW, 2023 : 3954 - 3962
  • [50] A hierarchical transformer-based network for multivariate time series classification
    Tang, Yingxia
    Wei, Yanxuan
    Li, Teng
    Zheng, Xiangwei
    Ji, Cun
    INFORMATION SYSTEMS, 2025, 132