Localization and Manipulation of Immoral Visual Cues for Safe Text-to-Image Generation

被引:0
|
作者
Park, Seongbeom [1 ]
Moon, Suhong [2 ]
Park, Seunghyun [3 ]
Kim, Jinkyu [1 ]
机构
[1] Korea Univ, CSE, Seoul, South Korea
[2] Univ Calif Berkeley, EECS, Berkeley, CA USA
[3] NAVER Cloud AI, Seoul, South Korea
来源
2024 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION, WACV 2024 | 2024年
关键词
D O I
10.1109/WACV57701.2024.00461
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Current text-to-image generation methods produce high-resolution and high-quality images, but they should not produce immoral images that may contain inappropriate content from the perspective of commonsense morality. Conventional approaches, however, often neglect these ethical concerns, and existing solutions are often limited to ensure moral compatibility. To address this, we propose a novel method that has three main capabilities: (1) our model recognizes the degree of visual commonsense immorality of a given generated image, (2) our model localizes immoral visual (and textual) attributes that make the image visually immoral, and (3) our model manipulates such immoral visual cues into a morally-qualifying alternative. We conduct experiments with various text-to-image generation models, including the state-of-the-art Stable Diffusion model, demonstrating the efficacy of our ethical image manipulation approach. Our human study further confirms that ours is indeed able to generate morally-satisfying images from immoral ones.
引用
收藏
页码:4663 / 4672
页数:10
相关论文
共 50 条
  • [41] MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices
    Zhao, Yang
    Xu, Yanwu
    Xiao, Zhisheng
    Jia, Haolin
    Hou, Tingbo
    COMPUTER VISION - ECCV 2024, PT LXII, 2025, 15120 : 225 - 242
  • [42] Text-to-image generation combined with mutual information maximization
    Mo J.
    Xu K.
    Lin L.
    Ouyang N.
    Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2019, 46 (05): : 180 - 188
  • [43] Training-Free Consistent Text-to-Image Generation
    Tewel, Yoad
    Kaduri, Omri
    Gal, Rinon
    Kasten, Yoni
    Wolf, Lior
    Chechik, Gal
    Atzmon, Yuval
    ACM TRANSACTIONS ON GRAPHICS, 2024, 43 (04):
  • [44] ITI- GEN: Inclusive Text-to-Image Generation
    Zhang, Cheng
    Chen, Xuanbai
    Chai, Siqi
    Wu, Chen Henry
    Lagun, Dmitry
    Beeler, Thabo
    De la Torre, Fernando
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 3946 - 3957
  • [45] Translation-Enhanced Multilingual Text-to-Image Generation
    Li, Yaoyiran
    Chang, Ching-Yun
    Rawls, Stephen
    Vulic, Ivan
    Korhonen, Anna
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 9174 - 9193
  • [46] EmoGen: Emotional Image Content Generation with Text-to-Image Diffusion Models
    Yang, Jingyuan
    Feng, Jiawei
    Huang, Hui
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2024, 2024, : 6358 - 6368
  • [47] Locally controllable network based on visual-linguistic relation alignment for text-to-image generation
    Li, Zaike
    Liu, Li
    Zhang, Huaxiang
    Liu, Dongmei
    Song, Yu
    Li, Boqun
    MULTIMEDIA SYSTEMS, 2024, 30 (01)
  • [48] TEXT LOCALIZATION USING IMAGE CUES AND TEXT LINE INFORMATION
    Toan Nguyen Dinh
    Park, Jonghyun
    Lee, Gueesang
    2010 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, 2010, : 2261 - 2264
  • [49] Background Layout Generation and Object Knowledge Transfer for Text-to-Image Generation
    Chen, Zhuowei
    Mao, Zhendong
    Fang, Shancheng
    Hu, Bo
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 4327 - 4335
  • [50] GreenStableYolo: Optimizing Inference Time and Image Quality of Text-to-Image Generation
    Gong, Jingzhi
    Li, Sisi
    D'Aloisio, Giordano
    Ding, Zishuo
    Ye, Yulong
    Langdon, William B.
    Sarro, Federica
    SEARCH-BASED SOFTWARE ENGINEERING, SSBSE 2024, 2024, 14767 : 70 - 76