Persistent Anti-Muslim Bias in Large Language Models

Cited by: 135
Authors
Abid, Abubakar [1]
Farooqi, Maheen [2]
Zou, James [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] McMaster Univ, Hamilton, ON, Canada
Keywords
machine learning; language models; bias; stereotypes; ethics
DOI
10.1145/3461702.3462624
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
It has been observed that large-scale language models capture undesirable societal biases, e.g., relating to race and gender; yet religious bias has been relatively unexplored. We demonstrate that GPT-3, a state-of-the-art contextual language model, captures persistent Muslim-violence bias. We probe GPT-3 in various ways, including prompt completion, analogical reasoning, and story generation, to understand this anti-Muslim bias, demonstrating that it appears consistently and creatively in different uses of the model and that it is severe even compared to biases about other religious groups. For instance, "Muslim" is analogized to "terrorist" in 23% of test cases, while "Jewish" is mapped to its most common stereotype, "money," in 5% of test cases. We quantify the positive distraction needed to overcome this bias with adversarial text prompts, and find that use of the six most positive adjectives reduces violent completions for "Muslims" from 66% to 20%, which is still higher than for other religious groups.
Pages: 298 - 306
Page count: 9
Related papers
50 items in total
  • [41] The Difficulties of Italian Muslim Political Mobilization: Anti-Muslim Sentiment and Internal Fragmentation
    Pupcenoks, Juris
    JOURNAL OF MUSLIM MINORITY AFFAIRS, 2021, 41 (02) : 233 - 249
  • [42] A sociological comparison of anti-Semitism and anti-Muslim sentiment in Britain
    Meer, Nasar
    Noorani, Tehseen
    SOCIOLOGICAL REVIEW, 2008, 56 (02) : 195 - 219
  • [43] Racial, Religious, and Civic Dimensions of Anti-Muslim Sentiment in America
    Gerteis, Joseph
    Hartmann, Douglas
    Edgell, Penny
    SOCIAL PROBLEMS, 2020, 67 (04) : 719 - 740
  • [44] Bias and Fairness in Large Language Models: A Survey
    Gallegos, Isabel O.
    Rossi, Ryan A.
    Barrow, Joe
    Tanjim, Md Mehrab
    Kim, Sungchul
    Dernoncourt, Franck
    Yu, Tong
    Zhang, Ruiyi
    Ahmed, Nesreen K.
    COMPUTATIONAL LINGUISTICS, 2024, 50 (03) : 1097 - 1179
  • [45] Assessing political bias in large language models
    Rettenberger, Luca
    Reischl, Markus
    Schutera, Mark
    JOURNAL OF COMPUTATIONAL SOCIAL SCIENCE, 2025, 8 (02):
  • [46] Gender bias and stereotypes in Large Language Models
    Kotek, Hadas
    Dockum, Rikker
    Sun, David Q.
    PROCEEDINGS OF THE ACM COLLECTIVE INTELLIGENCE CONFERENCE, CI 2023, 2023 : 12 - 24
  • [47] A Clash of Racializations: The Policing of 'Race' and of Anti-Muslim Racism in Ireland
    Carr, James
    Haynes, Amanda
    CRITICAL SOCIOLOGY, 2015, 41 (01) : 21 - 40
  • [48] Comparing levels of anti-Muslim attitudes across Western countries
    Savelkoul, Michael
    Scheepers, Peer
    van der Veld, William
    Hagendoorn, Louk
    QUALITY & QUANTITY, 2012, 46 : 1617 - 1624
  • [49] Islamophobia and Crime - Anti-Muslim Demonising and Racialised Targeting Introduction
    Poynting, Scott
    INTERNATIONAL JOURNAL FOR CRIME JUSTICE AND SOCIAL DEMOCRACY, 2015, 4 (03) : 1 - 3
  • [50] Pogrom in Gujarat: Hindu nationalism and anti-Muslim violence in India
    Sabhlok, Anu
    GENDER PLACE AND CULTURE, 2013, 20 (06) : 829 - 832