Concept Weaver: Enabling Multi-Concept Fusion in Text-to-Image Models

Citations: 0

Authors:

Kwon, Gihyun [1, 2]

Jenni, Simon [2]

Li, Dingzeyu [2]

Lee, Joon-Young [2]

Ye, Jong Chul [1]

Heilbron, Fabian Caba [2]

Affiliations:

[1] Korea Adv Inst Sci & Technol, Daejeon, South Korea

[2] Adobe, San Jose, CA 95110 USA

Source:

2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2024 | 2024

Keywords:

DOI:

10.1109/CVPR52733.2024.00848

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

While there has been significant progress in customizing text-to-image generation models, generating images that combine multiple personalized concepts remains challenging. In this work, we introduce Concept Weaver, a method for composing customized text-to-image diffusion models at inference time. Specifically, the method breaks the process into two steps: creating a template image aligned with the semantics of input prompts, and then personalizing the template using a concept fusion strategy. The fusion strategy incorporates the appearance of the target concepts into the template image while retaining its structural details. The results indicate that our method can generate multiple custom concepts with higher identity fidelity compared to alternative approaches. Furthermore, the method is shown to seamlessly handle more than two concepts and closely follow the semantic meaning of the input prompt without blending appearances across different subjects.

Pages: 8880-8889

Page count: 10
