Style Aligned Image Generation via Shared Attention

Cited by: 4
Authors
Hertz, Amir [1]
Voynov, Andrey [1]
Fruchter, Shlomi [1]
Cohen-Or, Daniel [1, 2]
Affiliations
[1] Google Res, Mountain View, CA 94043 USA
[2] Tel Aviv Univ, Tel Aviv, Israel
Source
2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2024 | 2024
DOI
10.1109/CVPR52733.2024.00457
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Large-scale Text-to-Image (T2I) models have rapidly gained prominence across creative fields, generating visually compelling outputs from textual prompts. However, controlling these models to ensure consistent style remains challenging, with existing methods necessitating fine-tuning and manual intervention to disentangle content and style. In this paper, we introduce StyleAligned, a novel technique designed to establish style alignment among a series of generated images. By employing minimal 'attention sharing' during the diffusion process, our method maintains style consistency across images within T2I models. This approach allows for the creation of style-consistent images using a reference style through a straightforward inversion operation. Our method's evaluation across diverse styles and text prompts demonstrates high-quality synthesis and fidelity, underscoring its efficacy in achieving consistent style across various inputs.
Pages: 4775-4785
Page count: 11