Nuanced Metrics for Measuring Unintended Bias with Real Data for Text Classification

Cited by: 132
Authors
Borkan, Daniel [1]
Dixon, Lucas [1]
Sorensen, Jeffrey [1]
Thain, Nithum [1]
Vasserman, Lucy [1]
Affiliations
[1] Jigsaw, Bellevue, WA USA
DOI
10.1145/3308560.3317593
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Unintended bias in Machine Learning can manifest as systemic differences in performance for different demographic groups, potentially compounding existing challenges to fairness in society at large. In this paper, we introduce a suite of threshold-agnostic metrics that provide a nuanced view of this unintended bias, by considering the various ways that a classifier's score distribution can vary across designated groups. We also introduce a large new test set of online comments with crowd-sourced annotations for identity references. We use this to show how our metrics can be used to find new and potentially subtle unintended bias in existing public models.
Pages: 491-500
Number of pages: 10