Style Aligned Image Generation via Shared Attention

Cited by: 4

Authors:

Hertz, Amir [1]

Voynov, Andrey [1]

Fruchter, Shlomi [1]

Cohen-Or, Daniel [1,2]

Affiliations:

[1] Google Research, Mountain View, CA 94043, USA

[2] Tel Aviv University, Tel Aviv, Israel

Source:

2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024) | 2024

DOI:

10.1109/CVPR52733.2024.00457

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Large-scale Text-to-Image (T2I) models have rapidly gained prominence across creative fields, generating visually compelling outputs from textual prompts. However, controlling these models to ensure consistent style remains challenging, with existing methods necessitating fine-tuning and manual intervention to disentangle content and style. In this paper, we introduce StyleAligned, a novel technique designed to establish style alignment among a series of generated images. By employing minimal 'attention sharing' during the diffusion process, our method maintains style consistency across images within T2I models. This approach allows for the creation of style-consistent images using a reference style through a straightforward inversion operation. Our method's evaluation across diverse styles and text prompts demonstrates high-quality synthesis and fidelity, underscoring its efficacy in achieving consistent style across various inputs.
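
The abstract does not spell out how 'attention sharing' works, so the following is only a minimal pure-Python sketch under stated assumptions, not the authors' implementation. It assumes that sharing means each image's self-attention queries attend to that image's own keys and values concatenated with those of a reference image. The names `softmax`, `attend`, and `shared_attention`, the list-based data layout, and the choice of the first image as the style reference are all hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

def shared_attention(images):
    """images: list of (queries, keys, values) tuples, one per generated image.

    Assumption of this sketch: image 0 is the style reference, and every
    image attends to the reference keys/values in addition to its own.
    """
    _, ref_k, ref_v = images[0]
    outputs = []
    for q, k, v in images:
        # Concatenate reference tokens with the image's own tokens.
        outputs.append(attend(q, ref_k + k, ref_v + v))
    return outputs
```

In this sketch the reference image's output is unchanged relative to plain self-attention (its own tokens are simply duplicated), while every other image's output becomes a mixture that includes the reference values, which is one plausible reading of how a shared style could propagate across a set of images.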

Pages: 4775-4785

Number of pages: 11
