Coresets for Clustering with Fairness Constraints

Citations: 0
Authors
Huang, Lingxiao [1]
Jiang, Shaofeng H.-C. [2]
Vishnoi, Nisheeth K. [1]
Affiliations
[1] Yale Univ, New Haven, CT 06520 USA
[2] Weizmann Inst Sci, Rehovot, Israel
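Funding
Swiss National Science Foundation;
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract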
In a recent work, [20] studied the following "fair" variants of classical clustering problems such as k-means and k-median: given a set of n data points in $\mathbb{R}^d$ and a binary type associated to each data point, the goal is to cluster the points while ensuring that the proportion of each type in each cluster is roughly the same as its underlying proportion. Subsequent work has focused either on extending this setting to the case where each data point has multiple, non-disjoint sensitive types such as race and gender [7], or on addressing the problem that the clustering algorithms in the above work do not scale well [42, 8, 6]. The main contribution of this paper is a scalable approach to clustering with fairness constraints that involve multiple, non-disjoint types. Our approach is based on novel constructions of coresets: for the k-median objective, we construct an $\varepsilon$-coreset of size $O(\Gamma k^2 \varepsilon^{-d})$, where $\Gamma$ is the number of distinct collections of groups that a point may belong to, and for the k-means objective, we show how to construct an $\varepsilon$-coreset of size $O(\Gamma k^3 \varepsilon^{-d-1})$. The former result is the first known coreset construction for the fair clustering problem with the k-median objective, and the latter result removes the dependence on the size of the full dataset as in [42] and generalizes it to multiple, non-disjoint types. Plugging our coresets into existing algorithms for fair clustering such as [6] results in the fastest algorithms for several cases. Empirically, we assess our approach on the Adult, Bank, Diabetes and Athlete datasets, and show that the coreset sizes are much smaller than the full dataset; applying coresets indeed accelerates the computation of the fair clustering objective while ensuring that the resulting objective difference is small. We also speed up recent fair clustering algorithms [6, 7] by incorporating our coreset construction.
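The two quantities the abstract relies on, the number $\Gamma$ of distinct collections of groups and the per-cluster proportion of each type, can be illustrated with a minimal Python sketch. This is not the paper's coreset construction: the function names, the toy data, and the "largest gap" summary are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Illustrative sketch only: names and toy data are assumptions, not from the paper.

def count_group_collections(memberships):
    """Gamma: number of distinct collections of groups that occur among the points."""
    return len({frozenset(groups) for groups in memberships})

def cluster_type_proportions(memberships, labels):
    """For each cluster, the fraction of its points that belong to each group."""
    sizes = Counter(labels)
    counts = {c: Counter() for c in sizes}
    for groups, c in zip(memberships, labels):
        for g in groups:
            counts[c][g] += 1
    return {c: {g: k / sizes[c] for g, k in counts[c].items()} for c in sizes}

def max_proportion_gap(memberships, labels):
    """Largest absolute gap between a group's share in a cluster and its overall share."""
    n = len(memberships)
    overall = Counter(g for groups in memberships for g in groups)
    per_cluster = cluster_type_proportions(memberships, labels)
    gap = 0.0
    for props in per_cluster.values():
        for g, total in overall.items():
            gap = max(gap, abs(props.get(g, 0.0) - total / n))
    return gap

# Toy data: each point carries two non-disjoint sensitive types.
memberships = [
    {"female", "young"}, {"male", "young"}, {"female", "old"},
    {"male", "old"}, {"female", "young"}, {"male", "old"},
]
labels = [0, 0, 1, 1, 1, 0]  # a hypothetical 2-clustering

gamma = count_group_collections(memberships)
gap = max_proportion_gap(memberships, labels)
print(gamma, round(gap, 4))
```

A clustering is closer to "fair" in the sense described above when this gap is small. The coreset sizes quoted in the abstract grow linearly in $\Gamma$, which is why the number of distinct group collections matters.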
Pages: 12