Inspecting the Geographical Representativeness of Images from Text-to-Image Models

Cited by: 0
Authors
Basu, Abhipsa [1 ]
Babu, R. Venkatesh [1 ]
Pruthi, Danish [2 ]
Affiliations
[1] IISc Bangalore, Vis & AI Lab, Bengaluru, India
[2] IISc Bangalore, FLAIR Lab, Bengaluru, India
DOI
10.1109/ICCV51070.2023.00474
CLC Classification
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Recent progress in generative models has resulted in models that produce both realistic and relevant images for most textual inputs. These models are being used to generate millions of images every day, and hold the potential to drastically impact areas such as generative art, digital marketing and data augmentation. Given their outsized impact, it is important to ensure that the generated content reflects the artifacts and surroundings across the globe, rather than over-representing certain parts of the world. In this paper, we measure the geographical representativeness of common nouns (e.g., a house) generated through DALL·E 2 and Stable Diffusion models using a crowdsourced study comprising 540 participants across 27 countries. For deliberately underspecified inputs without country names, the generated images most reflect the surroundings of the United States, followed by India, and the top generations rarely reflect surroundings from all other countries (average score less than 3 out of 5). Specifying the country names in the input increases the representativeness by 1.44 points on average on a 5-point Likert scale for DALL·E 2 and by 0.75 for Stable Diffusion; however, the overall scores for many countries still remain low, highlighting the need for future models to be more geographically inclusive. Lastly, we examine the feasibility of quantifying the geographical representativeness of generated images without conducting user studies.
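The headline numbers in the abstract are simple aggregates of 5-point Likert ratings. The sketch below illustrates that arithmetic in Python: mean ratings per (country, model, prompt type) group, and the average per-country gain from naming the country in the prompt. The sample records, identifiers, and the mean_scores helper are hypothetical illustrations, not the authors' released code; in the actual study each group would pool ratings from many participants per country.

# Illustrative sketch only (not the authors' code). Computes mean Likert
# ratings per group and the average gain from country-specified prompts,
# analogous to the reported +1.44 (DALL·E 2) and +0.75 (Stable Diffusion).
# All sample data below is made up.
from collections import defaultdict
from statistics import mean

# Each record: (country, model, prompt_type, likert_rating in 1..5).
ratings = [
    ("US", "dalle2", "unspecified", 4), ("US", "dalle2", "country", 5),
    ("JP", "dalle2", "unspecified", 2), ("JP", "dalle2", "country", 4),
    ("JP", "sd",     "unspecified", 2), ("JP", "sd",     "country", 3),
]

def mean_scores(records):
    # Mean rating for each (country, model, prompt_type) group.
    groups = defaultdict(list)
    for country, model, prompt_type, score in records:
        groups[(country, model, prompt_type)].append(score)
    return {key: mean(vals) for key, vals in groups.items()}

scores = mean_scores(ratings)

# Average per-country improvement when the country name is specified.
for model in sorted({m for _, m, _, _ in ratings}):
    deltas = [scores[(c, model, "country")] - scores[(c, model, "unspecified")]
              for c in sorted({c for c, m, _, _ in ratings if m == model})]
    print(model, round(mean(deltas), 2))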
Pages: 5113-5124 (12 pages)