Style Aligned Image Generation via Shared Attention

Cited by: 4

Authors:

Hertz, Amir [1]

Voynov, Andrey [1]

Fruchter, Shlomi [1]

Cohen-Or, Daniel [1,2]

Affiliations:

[1] Google Res, Mountain View, CA 94043 USA

[2] Tel Aviv Univ, Tel Aviv, Israel

Source:

2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2024 | 2024

DOI:

10.1109/CVPR52733.2024.00457

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Large-scale Text-to-Image (T2I) models have rapidly gained prominence across creative fields, generating visually compelling outputs from textual prompts. However, controlling these models to ensure consistent style remains challenging, with existing methods necessitating fine-tuning and manual intervention to disentangle content and style. In this paper, we introduce StyleAligned, a novel technique designed to establish style alignment among a series of generated images. By employing minimal 'attention sharing' during the diffusion process, our method maintains style consistency across images within T2I models. This approach allows for the creation of style-consistent images using a reference style through a straightforward inversion operation. Our method's evaluation across diverse styles and text prompts demonstrates high-quality synthesis and fidelity, underscoring its efficacy in achieving consistent style across various inputs.

Pages: 4775-4785

Page count: 11
