Neutralizing Gender Bias in Word Embeddings with Latent Disentanglement and Counterfactual Generation

Cited by: 0
Authors
Shin, Seungjae [1 ]
Song, Kyungwoo [1 ]
Jang, JoonHo [1 ]
Kim, Hyemi [1 ]
Joo, Weonyoung [1 ]
Moon, Il-Chul [1 ]
Affiliations
[1] Korea Adv Inst Sci & Technol KAIST, Daejeon, South Korea
Funding
National Research Foundation of Singapore;
Keywords
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Recent research demonstrates that word embeddings trained on human-generated corpora exhibit strong gender biases in their embedding spaces, and these biases can lead to discriminative outcomes in various downstream tasks. Whereas previous methods project word embeddings into a linear subspace for debiasing, we introduce a Latent Disentanglement method based on a siamese auto-encoder with an adapted gradient reversal layer. This structure separates the semantic latent information and the gender latent information of a given word into disjoint latent dimensions. We then introduce a Counterfactual Generation step to convert the gender information of words, so that the original and modified embeddings produce a gender-neutralized word embedding after geometric alignment regularization, without loss of semantic information. Across various quantitative and qualitative debiasing experiments, our method outperforms existing debiasing methods for word embeddings. In addition, our method preserves semantic information during debiasing, minimizing semantic losses on extrinsic NLP downstream tasks.
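The abstract names two mechanisms: a gradient reversal layer that discourages one part of the latent code from carrying gender information, and counterfactual generation that flips the gender part so the original and flipped codes average to a neutral one. The following is a minimal NumPy sketch of those two ideas, not the authors' implementation; the latent layout (semantic dimensions followed by gender dimensions) and the sign-flip counterfactual are illustrative assumptions.

```python
import numpy as np


class GradientReversal:
    """Gradient reversal layer (GRL) sketch: identity on the forward pass,
    gradients scaled by -lam on the backward pass. Placed before a gender
    classifier, it pushes the encoder to *remove* gender-predictive signal."""

    def __init__(self, lam: float = 1.0):
        self.lam = lam

    def forward(self, x: np.ndarray) -> np.ndarray:
        return x  # pass-through

    def backward(self, grad_output: np.ndarray) -> np.ndarray:
        return -self.lam * grad_output  # reversed gradient for the encoder


def neutralize(z: np.ndarray, gender_dims: slice) -> np.ndarray:
    """Average a latent code with a gender-flipped counterfactual.

    Assumes a disentangled latent z whose gender information lives only in
    `gender_dims`; negating those dimensions (a hypothetical counterfactual)
    and averaging zeroes them out while leaving semantic dimensions intact.
    """
    z_cf = z.copy()
    z_cf[gender_dims] = -z_cf[gender_dims]  # illustrative gender flip
    return 0.5 * (z + z_cf)


# Forward is identity; backward reverses the gradient direction.
grl = GradientReversal(lam=1.0)
x = np.array([1.0, -2.0, 3.0])
print(grl.forward(x))                 # unchanged
print(grl.backward(np.ones(3)))       # negated

# First three dims semantic, last dim gender: averaging with the
# counterfactual zeroes the gender dim and keeps the semantic dims.
z = np.array([0.5, 0.2, 0.9, -0.8])
print(neutralize(z, slice(3, 4)))
```

In the paper's pipeline the counterfactual is produced by the trained auto-encoder rather than a raw sign flip, but the averaging intuition is the same: semantic coordinates agree between the pair and survive, gender coordinates disagree and cancel.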
Pages: 15
Related Papers
50 results
  • [1] Gender Bias in Contextualized Word Embeddings
    Zhao, Jieyu
    Wang, Tianlu
    Yatskar, Mark
    Cotterell, Ryan
    Ordonez, Vicente
    Chang, Kai-Wei
    [J]. 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 629 - 634
  • [2] Investigation of Gender Bias in Turkish Word Embeddings
    Sevim, Nurullah
    Koc, Aykut
    [J]. 29TH IEEE CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS (SIU 2021), 2021,
  • [3] Measuring Gender Bias in Word Embeddings across Domains and Discovering New Gender Bias Word Categories
    Chaloner, Kaytlin
    Maldonado, Alfredo
    [J]. GENDER BIAS IN NATURAL LANGUAGE PROCESSING (GEBNLP 2019), 2019, : 25 - 32
  • [4] Evaluating the Underlying Gender Bias in Contextualized Word Embeddings
    Basta, Christine
    Costa-jussa, Marta R.
    Casas, Noe
    [J]. GENDER BIAS IN NATURAL LANGUAGE PROCESSING (GEBNLP 2019), 2019, : 33 - 39
  • [5] Counterfactual Explanation for Regression via Disentanglement in Latent Space
    Zhao, Xuan
    Broelemann, Klaus
    Kasneci, Gjergji
    [J]. 2023 23RD IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS, ICDMW 2023, 2023, : 976 - 984
  • [6] Extensive study on the underlying gender bias in contextualized word embeddings
    Basta, Christine
    Costa-jussà, Marta R.
    Casas, Noe
    [J]. NEURAL COMPUTING & APPLICATIONS, 2021, 33 (08): 3371 - 3384
  • [7] Iterative Adversarial Removal of Gender Bias in Pretrained Word Embeddings
    Gaci, Yacine
    Benatallah, Boualem
    Casati, Fabio
    Benabdeslem, Khalid
    [J]. 37TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, 2022, : 829 - 836
  • [8] Bias in Word Embeddings
    Papakyriakopoulos, Orestis
    Hegelich, Simon
    Serrano, Juan Carlos Medina
    Marco, Fabienne
    [J]. FAT* '20: PROCEEDINGS OF THE 2020 CONFERENCE ON FAIRNESS, ACCOUNTABILITY, AND TRANSPARENCY, 2020, : 446 - 457
  • [10] Quantifying 60 Years of Gender Bias in Biomedical Research with Word Embeddings
    Rios, Anthony
    Joshi, Reenam
    Shin, Hejin
    [J]. 19TH SIGBIOMED WORKSHOP ON BIOMEDICAL LANGUAGE PROCESSING (BIONLP 2020), 2020, : 1 - 13