Tackling Documentation Debt: A Survey on Algorithmic Fairness Datasets

Cited by: 8
Authors
Fabris, Alessandro [1]
Messina, Stefano [1]
Silvello, Gianmaria [1]
Susto, Gian Antonio [1]
Affiliations
[1] Univ Padua, Padua, Italy
Source
ACM CONFERENCE ON EQUITY AND ACCESS IN ALGORITHMS, MECHANISMS, AND OPTIMIZATION, EAAMO 2022 | 2022
Keywords
Algorithmic fairness; Data studies; Documentation debt;
DOI
10.1145/3551624.3555286
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A growing community of researchers has been investigating the equity of algorithms, advancing the understanding of risks and opportunities of automated decision-making for historically disadvantaged populations. Progress in fair Machine Learning (ML) hinges on data, which can be appropriately used only if adequately documented. Unfortunately, the research community, as a whole, suffers from a collective data documentation debt caused by a lack of information on specific resources (opacity) and scatteredness of available information (sparsity). In this work, we survey over two hundred datasets employed in algorithmic fairness research, producing standardized and searchable documentation for each of them. Moreover, we rigorously identify the three most popular fairness datasets, namely Adult, COMPAS, and German Credit, for which we compile in-depth documentation. This unifying documentation effort targets documentation sparsity and supports multiple contributions. In the first part of this work, we summarize the merits and limitations of Adult, COMPAS, and German Credit, adding to and unifying recent scholarship, calling into question their suitability as general-purpose fairness benchmarks. To overcome this limitation, we document hundreds of available alternatives, annotating their domain and the algorithmic fairness tasks they support, along with additional properties of interest for fairness practitioners and researchers, including their format, cardinality, and the sensitive attributes they encode. In the second part, we summarize this information, zooming in on the domains and tasks supported by these resources. Overall, we assemble and summarize sparse information on hundreds of datasets into a single resource, which we make available to the community, with the aim of tackling the data documentation debt.
Pages: 13
Related papers
50 in total
  • [21] Fairness in sovereign debt
    Barry, Christian
    Tomitova, Lydia
    SOCIAL RESEARCH, 2006, 73 (02) : 649 - 694
  • [22] Formalizing Fairness: Algorithmic fairness aims to remedy issues stemming from algorithmic bias
    Krakovsky, Marina
    COMMUNICATIONS OF THE ACM, 2022, 65 (08) : 11 - 13
  • [23] Reconciling Algorithmic Fairness Criteria
    Beigang, Fabian
    PHILOSOPHY & PUBLIC AFFAIRS, 2023, 51 (02) : 166 - 190
  • [24] On statistical criteria of algorithmic fairness
    Hedden, Brian
    PHILOSOPHY & PUBLIC AFFAIRS, 2021, 49 (02) : 209 - 231
  • [25] The ideals program in algorithmic fairness
    Stewart, Rush T.
    AI & SOCIETY, 2024,
  • [26] A brief review on algorithmic fairness
    Xiaomeng Wang
    Yishi Zhang
    Ruilin Zhu
    Management System Engineering, 1 (1):
  • [27] Algorithmic fairness in social context
    Huang Y.
    Liu W.
    Gao W.
    Lu X.
    Liang X.
    Yang Z.
    Li H.
    Ma L.
    Tang S.
    BenchCouncil Transactions on Benchmarks, Standards and Evaluations, 2023, 3 (03):
  • [28] Predictive policing and algorithmic fairness
    Tzu-Wei Hung
    Chun-Ping Yen
    Synthese, 201
  • [29] On Algorithmic Fairness in Medical Practice
    Grote, Thomas
    Keeling, Geoff
    CAMBRIDGE QUARTERLY OF HEALTHCARE ETHICS, 2022, 31 (01) : 83 - 94
  • [30] An Economic Perspective on Algorithmic Fairness
    Rambachan, Ashesh
    Kleinberg, Jon
    Ludwig, Jens
    Mullainathan, Sendhil
    AEA PAPERS AND PROCEEDINGS, 2020, 110 : 91 - 95