Unraveling Downstream Gender Bias from Large Language Models: A Study on AI Educational Writing Assistance

Cited by: 0
Authors
Wambsganss, Thiemo [1 ]
Su, Xiaotian [2 ]
Swamy, Vinitra [2 ]
Neshaei, Seyed Parsa [2 ]
Rietsche, Roman [1 ]
Käser, Tanja [2 ]
Affiliations
[1] Bern Univ Appl Sci, Bern, Switzerland
[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Large Language Models (LLMs) are increasingly utilized in educational tasks such as providing writing suggestions to students. Despite their potential, LLMs are known to harbor inherent biases which may negatively impact learners. Previous studies have investigated bias in models and data representations separately, neglecting the potential impact of LLM bias on human writing. In this paper, we investigate how bias transfers through an AI writing support pipeline. We conduct a large-scale user study with 231 students writing business case peer reviews in German. Students are divided into five groups with different levels of writing support: one classroom group with feature-based suggestions and four groups recruited from Prolific: a control group with no assistance, two groups with suggestions from fine-tuned GPT-2 and GPT-3 models, and one group with suggestions from pre-trained GPT-3.5. Using GenBit gender bias analysis, the Word Embedding Association Test (WEAT), and the Sentence Embedding Association Test (SEAT), we evaluate gender bias at various stages of the pipeline: in model embeddings, in suggestions generated by the models, and in reviews written by students. Our results demonstrate that there is no significant difference in gender bias between the resulting peer reviews of groups with and without LLM suggestions. Our research is therefore optimistic about the use of AI writing support in the classroom, showcasing a context where bias in LLMs does not transfer to students' responses.
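The WEAT metric named in the abstract quantifies bias as an effect size over cosine similarities between two target word sets (e.g. male vs. female terms) and two attribute sets (e.g. career vs. family terms). A minimal sketch of the standard WEAT effect size, using toy 2-D vectors in place of real embeddings; all word sets and vectors here are hypothetical placeholders, not values from the paper:

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def assoc(w, A, B):
    # s(w, A, B): mean similarity of word w to attribute set A minus set B
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    # Effect size d: difference of mean associations of the target sets,
    # normalized by the standard deviation over all targets X ∪ Y
    sx = [assoc(x, A, B) for x in X]
    sy = [assoc(y, A, B) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1)

# Toy 2-D "embeddings" for illustration only:
X = [np.array([1.0, 0.1]), np.array([0.9, 0.2])]  # target set 1 (e.g. male terms)
Y = [np.array([0.1, 1.0]), np.array([0.2, 0.9])]  # target set 2 (e.g. female terms)
A = [np.array([1.0, 0.0])]                        # attribute set 1 (e.g. career)
B = [np.array([0.0, 1.0])]                        # attribute set 2 (e.g. family)

print(round(weat_effect_size(X, Y, A, B), 2))
```

A value near zero indicates no differential association; values approaching ±2 indicate strong bias. SEAT applies the same statistic to sentence-level embeddings.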
Pages: 10275-10288
Page count: 14