Style Aligned Image Generation via Shared Attention

Cited by: 4
Authors
Hertz, Amir [1]
Voynov, Andrey [1]
Fruchter, Shlomi [1]
Cohen-Or, Daniel [1, 2]
Affiliations
[1] Google Res, Mountain View, CA 94043 USA
[2] Tel Aviv Univ, Tel Aviv, Israel
DOI
10.1109/CVPR52733.2024.00457
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Large-scale Text-to-Image (T2I) models have rapidly gained prominence across creative fields, generating visually compelling outputs from textual prompts. However, controlling these models to ensure consistent style remains challenging, with existing methods necessitating fine-tuning and manual intervention to disentangle content and style. In this paper, we introduce StyleAligned, a novel technique designed to establish style alignment among a series of generated images. By employing minimal 'attention sharing' during the diffusion process, our method maintains style consistency across images within T2I models. This approach allows for the creation of style-consistent images using a reference style through a straightforward inversion operation. Our method's evaluation across diverse styles and text prompts demonstrates high-quality synthesis and fidelity, underscoring its efficacy in achieving consistent style across various inputs.
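The abstract names "attention sharing" without giving details, so the following is only a toy sketch of one plausible reading: within a self-attention layer, each image in a batch attends to its own tokens plus the tokens of a reference image. The function names (`attention`, `shared_attention`), the choice of index 0 as the reference, and the pure-Python list representation are all assumptions made for illustration; this is not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    dim_out = len(values[0])
    output = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        weights = softmax(scores)
        output.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim_out)]
        )
    return output


def shared_attention(batch_q, batch_k, batch_v, ref=0):
    """Hypothetical sharing rule (an assumption, not taken from the paper):
    every non-reference image attends to the reference image's keys and
    values in addition to its own; the reference attends only to itself."""
    outputs = []
    for i, queries in enumerate(batch_q):
        if i == ref:
            keys, values = batch_k[i], batch_v[i]
        else:
            keys = batch_k[ref] + batch_k[i]
            values = batch_v[ref] + batch_v[i]
        outputs.append(attention(queries, keys, values))
    return outputs


# Two "images", each with two tokens of dimension two.
q = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [0.5, -0.5]]]
k = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
v = [[[10.0, 0.0], [0.0, 10.0]], [[1.0, 2.0], [3.0, 4.0]]]
out = shared_attention(q, k, v)
```

In this sketch the second image's output mixes in the reference image's values, which is the sense in which features of the reference can propagate to the other images in the batch.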
Pages: 4775-4785
Page count: 11
Related papers
50 in total
  • [1] Style-exprGAN: Diverse Smile Style Image Generation Via Attention-Guided Adversarial Networks
    Tu, Ching-Ting
    Chen, Kuan-Lin
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2024, 15 (03) : 1190 - 1201
  • [2] Progressive and Aligned Pose Attention Transfer for Person Image Generation
    Zhu, Zhen
    Huang, Tengteng
    Xu, Mengde
    Shi, Baoguang
    Cheng, Wenqing
    Bai, Xiang
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (08) : 4306 - 4320
  • [3] Adaptively Aligned Image Captioning via Adaptive Attention Time
    Huang, Lun
    Wang, Wenmin
    Xia, Yaxian
    Chen, Jie
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [4] Diagonal Attention and Style-based GAN for Content-Style Disentanglement in Image Generation and Translation
    Kwon, Gihyun
    Ye, Jong Chul
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021 : 13960 - 13969
  • [5] Attention-Aligned Transformer for Image Captioning
    Fei, Zhengcong
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022 : 607 - 615
  • [6] Chinese Image Caption Generation via Visual Attention and Topic Modeling
    Liu, Maofu
    Hu, Huijun
    Li, Lingjun
    Yu, Yan
    Guan, Weili
    IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52 (02) : 1247 - 1257
  • [7] Few-Shot Image Generation via Style Adaptation and Content Preservation
    He, Xiaosheng
    Yang, Fan
    Liu, Fayao
    Lin, Guosheng
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024
  • [8] Robust One-Shot Segmentation of Brain Tissues via Image-Aligned Style Transformation
    Lv, Jinxin
    Zeng, Xiaoyu
    Wang, Sheng
    Duan, Ran
    Wang, Zhiwei
    Li, Qiang
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 2, 2023 : 1861 - 1869
  • [9] Medical (CT) Image Generation with Style
    Krishna, Arjun
    Mueller, Klaus
    15TH INTERNATIONAL MEETING ON FULLY THREE-DIMENSIONAL IMAGE RECONSTRUCTION IN RADIOLOGY AND NUCLEAR MEDICINE, 2019, 11072
  • [10] Multihead Attention-based Audio Image Generation with Cross-Modal Shared Weight Classifier
    Xu, Yiming
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023