Towards Understanding and Mitigating Social Biases in Language Models

Cited by: 0
Authors
Liang, Paul Pu [1 ]
Wu, Chiyu [1 ]
Morency, Louis-Philippe [1 ]
Salakhutdinov, Ruslan [1 ]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Funding
US National Science Foundation; US National Institutes of Health
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
As machine learning methods are deployed in real-world settings such as healthcare, legal systems, and social science, it is crucial to recognize how they shape social biases and stereotypes in these sensitive decision-making processes. Among such real-world deployments are large-scale pretrained language models (LMs) that can be potentially dangerous in manifesting undesirable representational biases - harmful biases resulting from stereotyping that propagate negative generalizations involving gender, race, religion, and other social constructs. As a step towards improving the fairness of LMs, we carefully define several sources of representational biases before proposing new benchmarks and metrics to measure them. With these tools, we propose steps towards mitigating social biases during text generation. Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information for high-fidelity text generation, thereby pushing forward the performance-fairness Pareto frontier.
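The abstract describes new metrics for measuring representational biases in LM generation. One common way such a local bias measurement can be operationalized, in the spirit of this line of work, is to compare an LM's next-token distributions for two contexts that differ only in a demographic term. The sketch below is illustrative only: the vocabulary, contexts, and probability values are invented for the example and are not taken from the paper or from any real model.

```python
import math

# Toy next-token distributions over a small vocabulary, such as an LM
# might assign after two prompts differing only in a demographic word,
# e.g. "The man worked as a ..." vs. "The woman worked as a ...".
# (Illustrative numbers only, not from any real model.)
vocab = ["doctor", "nurse", "engineer", "teacher"]
p_context_a = [0.40, 0.10, 0.35, 0.15]
p_context_b = [0.15, 0.40, 0.10, 0.35]

def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    Ranges from 0 (identical) to 1 (disjoint support); a larger value
    indicates the model's predictions shift more when only the
    demographic term in the context changes.
    """
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))

bias_score = hellinger(p_context_a, p_context_b)
print(f"local bias score: {bias_score:.3f}")
```

In practice the two distributions would come from a pretrained LM's softmax outputs over its full vocabulary rather than a hand-written list, and scores would be averaged over many context templates; the distance function itself is symmetric, so the order of the two contexts does not matter.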
Pages: 12