Unraveling Downstream Gender Bias from Large Language Models: A Study on AI Educational Writing Assistance

Cited by: 0
Authors
Wambsganss, Thiemo [1]
Su, Xiaotian [2]
Swamy, Vinitra [2]
Neshaei, Seyed Parsa [2]
Rietsche, Roman [1]
Kaser, Tanja [2]
Affiliations
[1] Bern Univ Appl Sci, Bern, Switzerland
[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Large Language Models (LLMs) are increasingly utilized in educational tasks such as providing writing suggestions to students. Despite their potential, LLMs are known to harbor inherent biases which may negatively impact learners. Previous studies have investigated bias in models and data representations separately, neglecting the potential impact of LLM bias on human writing. In this paper, we investigate how bias transfers through an AI writing support pipeline. We conduct a large-scale user study with 231 students writing business case peer reviews in German. Students are divided into five groups with different levels of writing support: one classroom group with feature-based suggestions and four groups recruited from Prolific, namely a control group with no assistance, two groups with suggestions from fine-tuned GPT-2 and GPT-3 models, and one group with suggestions from pre-trained GPT-3.5. Using GenBit gender bias analysis, Word Embedding Association Tests (WEAT), and Sentence Embedding Association Test (SEAT), we evaluate the gender bias at various stages of the pipeline: in model embeddings, in suggestions generated by the models, and in reviews written by students. Our results demonstrate that there is no significant difference in gender bias between the resulting peer reviews of groups with and without LLM suggestions. Our research is therefore optimistic about the use of AI writing support in the classroom, showcasing a context where bias in LLMs does not transfer to students' responses.
Pages: 10275-10288
Number of pages: 14
Related papers
  • [1] LaMPost: AI Writing Assistance for Adults with Dyslexia Using Large Language Models
    Goodman, Steven M.
    Buehler, Erin
    Clary, Patrick
    Coenen, Andy
    Donsbach, Aaron
    Horne, Tiffanie N.
    Lahav, Michal
    Macdonald, Robert
    Michaels, Rain Breaw
    Narayanan, Ajit
    Pushkarna, Mahima
    Riley, Joel
    Santana, Alex
    Shi, Lei
    Sweeney, Rachel
    Weaver, Phil
    Yuan, Ann
    Morris, Meredith Ringel
    COMMUNICATIONS OF THE ACM, 2024, 67 (09)
  • [2] LaMPost: AI Writing Assistance for Adults with Dyslexia Using Large Language Models
    Goodman, Steven M.
    Buehler, Erin
    Clary, Patrick
    Coenen, Andy
    Donsbach, Aaron
    Horne, Tiffanie N.
    Lahav, Michal
    MacDonald, Robert
    Michaels, Rain Breaw
    Narayanan, Ajit
    Pushkarna, Mahima
    Riley, Joel
    Santana, Alex
    Shi, Lei
    Sweeney, Rachel
    Weaver, Phil
    Yuan, Ann
    Morris, Meredith Ringel
    COMMUNICATIONS OF THE ACM, 2024, 67 (09) : 80 - 89
  • [3] Gender bias and stereotypes in Large Language Models
    Kotek, Hadas
    Dockum, Rikker
    Sun, David Q.
    PROCEEDINGS OF THE ACM COLLECTIVE INTELLIGENCE CONFERENCE, CI 2023, 2023 : 12 - 24
  • [4] Locating and Mitigating Gender Bias in Large Language Models
    Cai, Yuchen
    Cao, Ding
    Guo, Rongxi
    Wen, Yaqin
    Liu, Guiquan
    Chen, Enhong
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT IV, ICIC 2024, 2024, 14878 : 471 - 482
  • [5] Evaluating and Mitigating Gender Bias in Generative Large Language Models
    Zhou, H.
    Inkpen, D.
    Kantarci, B.
    INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL, 2024, 19 (06)
  • [6] Human bias in AI models? Anchoring effects and mitigation strategies in large language models
    Nguyen, Jeremy K.
    JOURNAL OF BEHAVIORAL AND EXPERIMENTAL FINANCE, 2024, 43
  • [7] Communicating the cultural other: trust and bias in generative AI and large language models
    Jenks, Christopher J.
    APPLIED LINGUISTICS REVIEW, 2025, 16 (02) : 787 - 795
  • [8] Learning to Make Rare and Complex Diagnoses With Generative AI Assistance: Qualitative Study of Popular Large Language Models
    Abdullahi, Tassallah
    Singh, Ritambhara
    Eickhoff, Carsten
    JMIR MEDICAL EDUCATION, 2024, 10
  • [9] Large Language Models as AI-Powered Educational Assistants: Comparing GPT-4 and Gemini for Writing Teaching Cases
    Lang, Guido
    Triantoro, Tamilla
    Sharp, Jason H.
    JOURNAL OF INFORMATION SYSTEMS EDUCATION, 35 (03) : 390 - 407
  • [10] Clinical Research With Large Language Models Generated Writing-Clinical Research with AI-assisted Writing (CRAW) Study
    Huespe, Ivan A.
    Echeverri, Jorge
    Khalid, Aisha
    Bisso, Indalecio Carboni
    Musso, Carlos G.
    Surani, Salim
    Bansal, Vikas
    Kashyap, Rahul
    CRITICAL CARE EXPLORATIONS, 2023, 5 (10) : E0975