Localization and Manipulation of Immoral Visual Cues for Safe Text-to-Image Generation

Cited by: 0

Authors:

Park, Seongbeom [1]

Moon, Suhong [2]

Park, Seunghyun [3]

Kim, Jinkyu [1]

Affiliations:

[1] Korea Univ, CSE, Seoul, South Korea

[2] Univ Calif Berkeley, EECS, Berkeley, CA USA

[3] NAVER Cloud AI, Seoul, South Korea

Source:

2024 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION, WACV 2024 | 2024

Keywords:

DOI:

10.1109/WACV57701.2024.00461

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Current text-to-image generation methods produce high-resolution, high-quality images, but they should not produce immoral images, that is, images containing content that is inappropriate from the perspective of commonsense morality. Conventional approaches, however, often neglect these ethical concerns, and existing solutions are often limited in their ability to ensure moral compatibility. To address this, we propose a novel method that has three main capabilities: (1) our model recognizes the degree of visual commonsense immorality of a given generated image, (2) our model localizes immoral visual (and textual) attributes that make the image visually immoral, and (3) our model manipulates such immoral visual cues into a morally-qualifying alternative. We conduct experiments with various text-to-image generation models, including the state-of-the-art Stable Diffusion model, demonstrating the efficacy of our ethical image manipulation approach. Our human study further confirms that our method is indeed able to generate morally-satisfying images from immoral ones.

Pages: 4663-4672

Page count: 10
