Localization and Manipulation of Immoral Visual Cues for Safe Text-to-Image Generation

Citations: 0

Authors:

Park, Seongbeom [1]

Moon, Suhong [2]

Park, Seunghyun [3]

Kim, Jinkyu [1]

Affiliations:

[1] Korea Univ, CSE, Seoul, South Korea

[2] Univ Calif Berkeley, EECS, Berkeley, CA USA

[3] NAVER Cloud AI, Seoul, South Korea

Source:

2024 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION, WACV 2024 | 2024

Keywords:

DOI:

10.1109/WACV57701.2024.00461

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Current text-to-image generation methods produce high-resolution, high-quality images, but they should not produce immoral images that may contain inappropriate content from the perspective of commonsense morality. Conventional approaches, however, often neglect these ethical concerns, and existing solutions are often limited in their ability to ensure moral compatibility. To address this, we propose a novel method that has three main capabilities: (1) our model recognizes the degree of visual commonsense immorality of a given generated image, (2) our model localizes the immoral visual (and textual) attributes that make the image visually immoral, and (3) our model manipulates such immoral visual cues into a morally-qualifying alternative. We conduct experiments with various text-to-image generation models, including the state-of-the-art Stable Diffusion model, demonstrating the efficacy of our ethical image manipulation approach. Our human study further confirms that our method is indeed able to generate morally-satisfying images from immoral ones.


Pages: 4663-4672

Number of pages: 10
