ViSAGe: A Global-Scale Analysis of Visual Stereotypes in Text-to-Image Generation

Authors
Jha, Akshita [1, 2]
Prabhakaran, Vinodkumar [2 ]
Denton, Remi [2 ]
Laszlo, Sarah [2 ]
Dave, Shachi [2 ]
Qadri, Rida [2 ]
Reddy, Chandan K. [1 ]
Dev, Sunipa [2 ]
Affiliations
[1] Virginia Tech, Blacksburg, VA 24061 USA
[2] Google Res, Mountain View, CA USA
Source
PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS | 2024
Abstract
Recent studies have shown that Text-to-Image (T2I) model generations can reflect social stereotypes present in the real world. However, existing approaches for evaluating stereotypes have a noticeable lack of coverage of global identity groups and their associated stereotypes. To address this gap, we introduce the ViSAGe (Visual Stereotypes Around the Globe) dataset to enable evaluation of known nationality-based stereotypes in T2I models, across 135 nationalities. We enrich an existing textual stereotype resource by distinguishing between stereotypical associations that are more likely to have visual depictions, such as 'sombrero', from those that are less visually concrete, such as 'attractive'. We demonstrate ViSAGe's utility through a multi-faceted evaluation of T2I generations. First, we show that stereotypical attributes in ViSAGe are thrice as likely to be present in generated images of corresponding identities as compared to other attributes, and that the offensiveness of these depictions is especially higher for identities from Africa, South America, and South East Asia. Second, we assess the stereotypical pull of visual depictions of identity groups, which reveals how the 'default' representations of all identity groups in ViSAGe have a pull towards stereotypical depictions, and that this pull is even more prominent for identity groups from the Global South. CONTENT WARNING: Some examples contain offensive stereotypes.
Pages: 12333-12347
Page count: 15