Inspecting the Geographical Representativeness of Images from Text-to-Image Models

Cited by: 0
Authors
Basu, Abhipsa [1 ]
Babu, R. Venkatesh [1 ]
Pruthi, Danish [2 ]
Affiliations
[1] IISc Bangalore, Vis & AI Lab, Bengaluru, India
[2] IISc Bangalore, FLAIR Lab, Bengaluru, India
DOI
10.1109/ICCV51070.2023.00474
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Recent progress in generative models has resulted in models that produce both realistic and relevant images for most textual inputs. These models are used to generate millions of images every day, and hold the potential to drastically impact areas such as generative art, digital marketing and data augmentation. Given their outsized impact, it is important to ensure that the generated content reflects artifacts and surroundings from across the globe, rather than over-representing certain parts of the world. In this paper, we measure the geographical representativeness of common nouns (e.g., a house) generated through the DALL·E 2 and Stable Diffusion models using a crowdsourced study comprising 540 participants across 27 countries. For deliberately underspecified inputs without country names, the generated images most reflect the surroundings of the United States, followed by India, and the top generations rarely reflect surroundings from all other countries (average score less than 3 out of 5). Specifying the country name in the input increases representativeness by 1.44 points on average on a 5-point Likert scale for DALL·E 2 and by 0.75 for Stable Diffusion; however, the overall scores for many countries still remain low, highlighting the need for future models to be more geographically inclusive. Lastly, we examine the feasibility of quantifying the geographical representativeness of generated images without conducting user studies.
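The abstract's headline numbers (mean scores below 3, a gain of 1.44 points for DALL·E 2) come from averaging 5-point Likert ratings per country and comparing prompts with and without the country name. A minimal sketch of that aggregation, with made-up ratings and hypothetical function names (the paper's actual pipeline is not specified here):

```python
# Hypothetical sketch: participants rate on a 1-5 Likert scale how well
# generated images reflect their country's surroundings. Representativeness
# is the mean rating; the gain from naming the country in the prompt is the
# difference between the two mean ratings. All ratings below are illustrative.
from statistics import mean

def representativeness(ratings):
    """Mean Likert rating (1-5) across participants for one country."""
    return mean(ratings)

def country_name_gain(unspecified, specified):
    """Change in mean rating when the country name is added to the prompt."""
    return representativeness(specified) - representativeness(unspecified)

# Made-up ratings from participants of a single country.
without_country = [2, 3, 2, 3, 2]
with_country = [4, 4, 3, 4, 4]
print(representativeness(without_country))                       # 2.4
print(round(country_name_gain(without_country, with_country), 2))  # 1.4
```

The paper reports these per-country means for 27 countries; a country's score staying under 3 out of 5 even with the country name in the prompt is what the abstract flags as low representativeness.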
Pages: 5113-5124 (12 pages)
Related Papers
50 in total
  • [1] Partiality and Misconception: Investigating Cultural Representativeness in Text-To-Image Models
    Zhang, Lili
    Liao, Xi
    Yang, Zaijia
    Gao, Baihang
    Wang, Chunjie
    Yang, Qiuling
    Li, Deshun
    [J]. PROCEEDINGS OF THE 2024 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI 2024), 2024,
  • [2] Exposing fake images generated by text-to-image diffusion models
    Xu, Qiang
    Wang, Hao
    Meng, Laijin
    Mi, Zhongjie
    Yuan, Jianye
    Yan, Hong
    [J]. PATTERN RECOGNITION LETTERS, 2023, 176 : 76 - 82
  • [4] Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models
    Qu, Yiting
    Shen, Xinyue
    He, Xinlei
    Backes, Michael
    Zannettou, Savvas
    Zhang, Yang
    [J]. PROCEEDINGS OF THE 2023 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, CCS 2023, 2023, : 3403 - 3417
  • [5] Holistic Evaluation of Text-to-Image Models
    Lee, Tony
    Yasunaga, Michihiro
    Meng, Chenlin
    Mai, Yifan
    Park, Joon Sung
    Gupta, Agrim
    Zhang, Yunzhi
    Narayanan, Deepak
    Teufel, Hannah Benita
    Bellagente, Marco
    Kang, Minguk
    Park, Taesung
    Leskovec, Jure
    Zhu, Jun-Yan
    Fei-Fei, Li
    Wu, Jiajun
    Ermon, Stefano
    Liang, Percy
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [6] StableRep: Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners
    Tian, Yonglong
    Fan, Lijie
    Isola, Phillip
    Chang, Huiwen
    Krishnan, Dilip
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [7] Evaluating Data Attribution for Text-to-Image Models
    Wang, Sheng-Yu
    Efros, Alexei A.
    Zhu, Jun-Yan
    Zhang, Richard
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 7158 - 7169
  • [8] Multilingual Conceptual Coverage in Text-to-Image Models
    Saxon, Michael
    Wang, William Yang
    [J]. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 4831 - 4848
  • [9] Ablating Concepts in Text-to-Image Diffusion Models
    Kumari, Nupur
    Zhang, Bingliang
    Wang, Sheng-Yu
    Shechtman, Eli
    Zhang, Richard
    Zhu, Jun-Yan
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 22634 - 22645
  • [10] Resolving Ambiguities in Text-to-Image Generative Models
    Mehrabi, Ninareh
    Goyal, Palash
    Verma, Apurv
    Dhamala, Jwala
    Kumar, Varun
    Hu, Qian
    Chang, Kai-Wei
    Zemel, Richard
    Galstyan, Aram
    Gupta, Rahul
    [J]. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 14367 - 14388