Gender-preserving Debiasing for Pre-trained Word Embeddings

Cited by: 0
Authors
Kaneko, Masahiro [1]
Bollegala, Danushka [2]
Affiliations
[1] Tokyo Metropolitan Univ, Hachioji, Tokyo, Japan
[2] Univ Liverpool, Liverpool, Merseyside, England
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Word embeddings learnt from massive text collections have demonstrated significant levels of discriminative biases such as gender, racial or ethnic biases, which in turn bias the downstream NLP applications that use those word embeddings. Taking gender-bias as a working example, we propose a debiasing method that preserves non-discriminative gender-related information, while removing stereotypical discriminative gender biases from pre-trained word embeddings. Specifically, we consider four types of information: feminine, masculine, gender-neutral and stereotypical, which represent the relationship between gender vs. bias, and propose a debiasing method that (a) preserves the gender-related information in feminine and masculine words, (b) preserves the neutrality in gender-neutral words, and (c) removes the biases from stereotypical words. Experimental results on several previously proposed benchmark datasets show that our proposed method can debias pre-trained word embeddings better than existing SoTA methods proposed for debiasing word embeddings while preserving gender-related but non-discriminative information.
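The core distinction the abstract draws — debias only stereotypical words while leaving feminine, masculine and gender-neutral words intact — can be illustrated with a minimal sketch. Note this is not the paper's actual method (which learns a debiasing transformation with an encoder and multiple per-category losses); it is a hard-debiasing-style projection applied selectively, using toy 4-dimensional vectors and a gender direction fixed from a single definitional pair, all of which are illustrative assumptions.

```python
import numpy as np

# Toy embeddings (hypothetical values, for illustration only).
emb = {
    "he":     np.array([ 0.9, 0.1, 0.2, 0.0]),
    "she":    np.array([-0.9, 0.1, 0.2, 0.0]),
    "nurse":  np.array([-0.5, 0.3, 0.7, 0.1]),  # stereotypically feminine-biased
    "doctor": np.array([ 0.5, 0.3, 0.7, 0.1]),  # stereotypically masculine-biased
    "queen":  np.array([-0.8, 0.2, 0.1, 0.3]),  # legitimately feminine
}

# Gender direction from one definitional pair (a common simplification;
# the paper instead learns the debiasing function from word lists).
g = emb["he"] - emb["she"]
g = g / np.linalg.norm(g)

def remove_gender_component(v, g):
    """Project out the gender direction from a vector."""
    return v - np.dot(v, g) * g

stereotypical = {"nurse", "doctor"}  # case (c): remove bias here only

debiased = {
    w: remove_gender_component(v, g) if w in stereotypical else v
    for w, v in emb.items()
}

# Stereotypical words lose their gender component; 'queen' keeps hers.
print(float(np.dot(debiased["nurse"], g)))  # → 0.0
print(float(np.dot(debiased["queen"], g)))  # unchanged, nonzero
```

Projecting every word onto the complement of the gender direction (as plain hard debiasing does) would also strip the legitimate gender information from words like "queen"; restricting the projection to the stereotypical set is the simplest way to mimic the selective behaviour the abstract describes.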
Pages: 1641 - 1650
Page count: 10
Related Papers
50 in total
  • [1] Dictionary-based Debiasing of Pre-trained Word Embeddings
    Kaneko, Masahiro
    Bollegala, Danushka
    [J]. 16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 212 - 223
  • [2] Debiasing Pre-trained Contextualised Embeddings
    Kaneko, Masahiro
    Bollegala, Danushka
    [J]. 16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 1256 - 1266
  • [3] Leveraging Pre-trained Language Models for Gender Debiasing
    Jain, Nishtha
    Popovic, Maja
    Groves, Declan
    Specia, Lucia
    [J]. LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 2188 - 2195
  • [4] The impact of using pre-trained word embeddings in Sinhala chatbots
    Gamage, Bimsara
    Pushpananda, Randil
    Weerasinghe, Ruvan
    [J]. 2020 20TH INTERNATIONAL CONFERENCE ON ADVANCES IN ICT FOR EMERGING REGIONS (ICTER-2020), 2020, : 161 - 165
  • [5] Disambiguating Clinical Abbreviations using Pre-trained Word Embeddings
    Jaber, Areej
    Martinez, Paloma
    [J]. HEALTHINF: PROCEEDINGS OF THE 14TH INTERNATIONAL JOINT CONFERENCE ON BIOMEDICAL ENGINEERING SYSTEMS AND TECHNOLOGIES - VOL. 5: HEALTHINF, 2021, : 501 - 508
  • [6] Sentiment analysis based on improved pre-trained word embeddings
    Rezaeinia, Seyed Mahdi
    Rahmani, Rouhollah
    Ghodsi, Ali
    Veisi, Hadi
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2019, 117 : 139 - 147
  • [7] Embodying Pre-Trained Word Embeddings Through Robot Actions
    Toyoda, Minori
    Suzuki, Kanata
    Mori, Hiroki
    Hayashi, Yoshihiko
    Ogata, Tetsuya
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2021, 6 (02) : 4225 - 4232
  • [8] Automated Employee Objective Matching Using Pre-trained Word Embeddings
    Ghanem, Mohab
    Elnaggar, Ahmed
    Mckinnon, Adam
    Debes, Christian
    Boisard, Olivier
    Matthes, Florian
    [J]. 2021 IEEE 25TH INTERNATIONAL ENTERPRISE DISTRIBUTED OBJECT COMPUTING CONFERENCE (EDOC 2021), 2021, : 51 - 60
  • [9] Investigating the Impact of Pre-trained Word Embeddings on Memorization in Neural Networks
    Thomas, Aleena
    Adelani, David Ifeoluwa
    Davody, Ali
    Mogadala, Aditya
    Klakow, Dietrich
    [J]. TEXT, SPEECH, AND DIALOGUE (TSD 2020), 2020, 12284 : 273 - 281
  • [10] A Comparative Study of Pre-trained Word Embeddings for Arabic Sentiment Analysis
    Zouidine, Mohamed
    Khalil, Mohammed
    [J]. 2022 IEEE 46TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2022), 2022, : 1243 - 1248