Hierarchical Image Generation via Transformer-Based Sequential Patch Selection

Cited by: 0

Authors:

Xu, Xiaogang [1]

Xu, Ning [2]

Affiliations:

[1] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Peoples R China

[2] Adobe Res, San Jose, CA USA

Source:

THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2022

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

To synthesize images with preferred objects and interactions, a controllable approach is to generate the image from a scene graph and a large pool of object crops: the scene graph defines the spatial arrangement of the objects in the image, while the crops retrieved from the pool determine their appearance. In this paper, we propose a novel framework based on this semi-parametric generation strategy. First, to encourage the retrieval of mutually compatible crops, we design a sequential selection strategy in which the crop selected for each object is determined by the contents and locations of all previously chosen object crops. This process is implemented via a transformer trained with contrastive losses. Second, to generate the final image, our hierarchical generation strategy uses hierarchical gated convolutions to synthesize areas not covered by any image crop, together with a proposed patch-guided spatially adaptive normalization module that ensures the final generated images comply with the crop appearance and the scene graph. Evaluated on the challenging Visual Genome and COCO-Stuff datasets, our method outperforms existing state-of-the-art methods.
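The sequential selection idea in the abstract can be illustrated with a minimal sketch. This is not the paper's transformer: as a stand-in, it assumes each candidate crop is already represented by an embedding vector, uses the mean of the previously chosen embeddings as context, and scores candidates by cosine similarity. It also ignores crop locations, which the abstract says the actual model takes into account. The names `cosine` and `select_crops_sequentially`, and the rule of taking the first candidate when no context exists yet, are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_crops_sequentially(candidates_per_object):
    """Pick one crop per object, conditioning each choice on earlier choices.

    candidates_per_object: list with one entry per object (in selection
    order); each entry is a list of candidate embedding vectors.
    Returns the index of the chosen candidate for each object.
    """
    chosen_indices = []
    chosen_vectors = []
    for candidates in candidates_per_object:
        if not chosen_vectors:
            # Assumption: with no context yet, take the first candidate.
            best = 0
        else:
            # Stand-in for the transformer: average the chosen embeddings.
            dim = len(chosen_vectors[0])
            context = [
                sum(v[d] for v in chosen_vectors) / len(chosen_vectors)
                for d in range(dim)
            ]
            # Choose the candidate most compatible with the context.
            best = max(
                range(len(candidates)),
                key=lambda i: cosine(candidates[i], context),
            )
        chosen_indices.append(best)
        chosen_vectors.append(candidates[best])
    return chosen_indices


# Toy pool: three objects with 1, 2, and 2 candidate crops respectively.
pool = [
    [[1.0, 0.0]],
    [[0.0, 1.0], [0.9, 0.1]],
    [[0.0, 1.0], [1.0, 0.2]],
]
selection = select_crops_sequentially(pool)
```

In this toy pool, the second and third objects each end up with the candidate that points in roughly the same direction as the crops already chosen, which is the compatibility effect that the sequential strategy is meant to encourage.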

Pages: 2938-2945

Page count: 8

