MarIA and BETO are sexist: evaluating gender bias in large language models for Spanish

Cited: 3
Authors
Garrido-Munoz, Ismael [1 ]
Martinez-Santiago, Fernando [1 ]
Montejo-Raez, Arturo [1 ]
Affiliations
[1] Univ Jaen, CEATIC, Campus Las Lagunillas, Jaen 23071, Spain
Keywords
Deep learning; Gender bias; Bias evaluation; Language model; BERT; RoBERTa
DOI
10.1007/s10579-023-09670-3
Chinese Library Classification
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
The study of bias in language models is a growing area of work; however, both research and resources are focused on English. In this paper, we make a first approach to gender bias in several freely available Spanish language models trained with popular deep neural architectures such as BERT or RoBERTa. Some of these models are known for achieving state-of-the-art results on downstream tasks. These promising results have promoted the integration of such models into many real-world applications and production environments, which could be detrimental to the people affected by those systems. This work proposes an evaluation framework to identify gender bias in masked language models, with explainability in mind to ease the interpretation of the evaluation results. We have evaluated 20 different models for Spanish, including some of the most popular pretrained ones in the research community. Our findings indicate that varying levels of gender bias are present across these models. The approach compares the adjectives proposed by the model for a set of templates. We classify the given adjectives into understandable categories and compute two new metrics from the model predictions, one based on the internal state (probability) and the other on the external state (rank). These metrics are used to reveal biased models according to the given categories and to quantify the degree of bias of the models under study.
Pages: 1387-1417
Page count: 31
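
The framework described in the abstract probes masked language models with gendered templates and compares the probability (internal state) and rank (external state) of the adjectives each model proposes. As a rough illustration only, the following Python sketch queries BETO through the Hugging Face fill-mask pipeline; the template sentences, the number of candidates retrieved, and the printed fields are assumptions for demonstration, not the authors' exact setup.

# Minimal sketch (not the authors' exact framework): probe a Spanish masked
# language model with gendered templates and inspect the probability and rank
# of the adjectives it proposes. Templates and top_k are illustrative choices.
from transformers import pipeline

# BETO (Spanish BERT); MarIA could be tried with "PlanTL-GOB-ES/roberta-base-bne"
unmasker = pipeline("fill-mask", model="dccuchile/bert-base-spanish-wwm-cased")
mask = unmasker.tokenizer.mask_token  # "[MASK]" for BERT, "<mask>" for RoBERTa

templates = {
    "male": f"El hombre es muy {mask}.",
    "female": f"La mujer es muy {mask}.",
}

top_k = 50  # number of candidate fillers retrieved per template

for gender, sentence in templates.items():
    predictions = unmasker(sentence, top_k=top_k)
    print(f"\n{gender}: {sentence}")
    for rank, pred in enumerate(predictions[:5], start=1):
        # 'score' is the model probability (internal state); the position in the
        # returned list gives the rank (external state).
        print(f"  rank={rank:2d}  p={pred['score']:.4f}  {pred['token_str']}")

In the paper's framework the retrieved adjectives would then be mapped to interpretable categories and aggregated into the probability-based and rank-based bias metrics; this sketch only prints the raw candidates per template.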