Image Representations Learned With Unsupervised Pre-Training Contain Human-like Biases

Cited by: 62
Authors
Steed, Ryan [1 ]
Caliskan, Aylin [2 ]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] George Washington Univ, Washington, DC USA
Keywords
implicit bias; unsupervised learning; computer vision; IMPLICIT ASSOCIATION TEST; ATTITUDES; BELIEFS
DOI
10.1145/3442188.3445932
Chinese Library Classification: TP301 [Theory and Methods]
Subject Classification Code: 081202
Abstract
Recent advances in machine learning leverage massive datasets of unlabeled images from the web to learn general-purpose image representations for tasks from image classification to face recognition. But do unsupervised computer vision models automatically learn implicit patterns and embed social biases that could have harmful downstream effects? We develop a novel method for quantifying biased associations between representations of social concepts and attributes in images. We find that state-of-the-art unsupervised models trained on ImageNet, a popular benchmark image dataset curated from internet images, automatically learn racial, gender, and intersectional biases. We replicate 8 previously documented human biases from social psychology, from the innocuous, as with insects and flowers, to the potentially harmful, as with race and gender. Our results closely match three hypotheses about intersectional bias from social psychology. For the first time in unsupervised computer vision, we also quantify implicit human biases about weight, disabilities, and several ethnicities. When compared with statistical patterns in online image datasets, our findings suggest that machine learning models can automatically learn bias from the way people are stereotypically portrayed on the web.
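The method described in the abstract adapts embedding association tests (in the style of Caliskan et al.'s WEAT, entry [4] below) to image representations. As an illustrative sketch only, the core effect-size computation can be written as below; the function names and toy 2-D vectors are assumptions for demonstration, not the authors' code, and real use would substitute image embeddings from a pre-trained model.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B):
    """Differential association of embedding w with attribute sets A and B."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def effect_size(X, Y, A, B):
    """WEAT-style effect size d between target sets X, Y and attributes A, B."""
    sx = [association(x, A, B) for x in X]
    sy = [association(y, A, B) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1)

# Toy 2-D "embeddings": targets X cluster with attribute A, targets Y with B,
# so the test reports a positive association (biased in the X-A / Y-B direction).
X = [np.array([1.0, 0.1]), np.array([0.9, 0.2])]
Y = [np.array([0.1, 1.0]), np.array([0.2, 0.9])]
A = [np.array([1.0, 0.0])]
B = [np.array([0.0, 1.0])]
d = effect_size(X, Y, A, B)
```

A positive d indicates the first target set is relatively closer to the first attribute set; swapping X and Y flips the sign.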
Pages: 701 - 713
Page count: 13
Related Papers
10 of 50 shown
  • [1] Detecting Emergent Intersectional Biases: Contextualized Word Embeddings Contain a Distribution of Human-like Biases
    Guo, Wei
    Caliskan, Aylin
    [J]. AIES '21: PROCEEDINGS OF THE 2021 AAAI/ACM CONFERENCE ON AI, ETHICS, AND SOCIETY, 2021, : 122 - 133
  • [2] Large pre-trained language models contain human-like biases of what is right and wrong to do
    Schramowski, Patrick
    Turan, Cigdem
    Andersen, Nico
    Rothkopf, Constantin A.
    Kersting, Kristian
    [J]. NATURE MACHINE INTELLIGENCE, 2022, 4 (03) : 258 - 268
  • [3] Semantics derived automatically from language corpora contain human-like biases
    Caliskan, Aylin
    Bryson, Joanna J.
    Narayanan, Arvind
    [J]. SCIENCE, 2017, 356 (6334) : 183 - 186
  • [4] Unsupervised Pre-Training for Detection Transformers
    Dai, Zhigang
    Cai, Bolun
    Lin, Yugeng
    Chen, Junying
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (11) : 12772 - 12782
  • [5] Unsupervised Pre-Training of Image Features on Non-Curated Data
    Caron, Mathilde
    Bojanowski, Piotr
    Mairal, Julien
    Joulin, Armand
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 2959 - 2968
  • [6] Unsupervised Pre-Training for Voice Activation
    Kolesau, Aliaksei
    Sesok, Dmitrij
    [J]. APPLIED SCIENCES-BASEL, 2020, 10 (23) : 1 - 13
  • [7] Pre-Training with Fractal Images Facilitates Learned Image Quality Estimation
    Silbernagel, Malte
    Wiegand, Thomas
    Eisert, Peter
    Bosse, Sebastian
    [J]. 2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 2625 - 2629
  • [8] Unsupervised Pre-training Across Image Domains Improves Lung Tissue Classification
    Schlegl, Thomas
    Ofner, Joachim
    Langs, Georg
    [J]. MEDICAL COMPUTER VISION: ALGORITHMS FOR BIG DATA, 2014, 8848 : 82 - 93
  • [9] Improving Image Representations via MoCo Pre-training for Multimodal CXR Classification
    Serra, Francesco Dalla
    Jacenkow, Grzegorz
    Deligianni, Fani
    Dalton, Jeff
    O'Neil, Alison Q.
    [J]. MEDICAL IMAGE UNDERSTANDING AND ANALYSIS, MIUA 2022, 2022, 13413 : 623 - 635