Making clusterings fairer by post-processing: algorithms, complexity results and experiments

Cited by: 0
Authors
Davidson, Ian [1]
Bai, Zilong [1]
Tran, Cindy Mylinh [1]
Ravi, S. S. [2, 3]
Affiliations
[1] Univ Calif Davis, Comp Sci Dept, Davis, CA 95616 USA
[2] Univ Virginia, Biocomplex Inst & Initiat, Charlottesville, VA 22904 USA
[3] SUNY Albany, Dept Comp Sci, Albany, NY 12222 USA
Keywords
Clustering; Protected status; Fairness; Algorithms; Complexity
DOI
10.1007/s10618-022-00893-6
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
While existing fairness work typically focuses on fair-by-design algorithms, here we consider making a fairness-unaware algorithm's output fairer. Specifically, we explore fairness in clustering by modifying clusterings produced by existing algorithms to make them fairer whilst retaining their quality. We formulate the minimal cluster modification for fairness (MCMF) problem, where the input is a given partitional clustering and the goal is to change it minimally so that the clustering is still of good quality but fairer. We show that for a single binary protected status variable, the problem is efficiently solvable (i.e., in the class P) by proving that the constraint matrix of an integer linear programming formulation is totally unimodular. Interestingly, we show that even for a single protected variable, the addition of simple pairwise guidance for clustering (for example, to ensure individual-level fairness) makes the MCMF problem computationally intractable (i.e., NP-hard). Experimental results using Twitter, Census and NYT data sets show that our methods can modify existing clusterings for data sets with more than 100,000 instances within minutes on laptops, and find clusterings that are as fair as, but of higher quality than, those produced by fair-by-design clustering algorithms. Finally, we explore a challenging practical problem: making a historical clustering (i.e., zipcodes clustered into California's congressional districts) fairer using a new multi-faceted benchmark data set.
Pages: 1404-1440
Number of pages: 37
Related papers
50 in total
  • [1] Making clusterings fairer by post-processing: algorithms, complexity results and experiments
    Ian Davidson
    Zilong Bai
    Cindy Mylinh Tran
    S. S. Ravi
    Data Mining and Knowledge Discovery, 2023, 37 : 1404 - 1440
  • [2] Making Existing Clusterings Fairer: Algorithms, Complexity Results and Insights
    Davidson, Ian
    Ravi, S. S.
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 3733 - 3740
  • [3] Hybrid algorithms for SAR matrix compression and the impact of post-processing on SAR calculation complexity
    Orzada, Stephan
    Fiedler, Thomas M.
    Ladd, Mark E.
    MAGNETIC RESONANCE IN MEDICINE, 2024, 92 (06) : 2696 - 2706
  • [5] Reducing the complexity of iterative post-processing of video
    Robertson, MA
    Stevenson, RL
    1998 MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS, PROCEEDINGS, 1999, : 399 - 402
  • [7] On Post-processing the Results of Quantum Optimizers
    Borle, Ajinkya
    McCarter, Josh
    THEORY AND PRACTICE OF NATURAL COMPUTING, TPNC 2019, 2019, 11934 : 222 - 233
  • [8] Post-processing search results at Teltech
    Sisler, P
    Cooper, L
    ONLINE, 1996, 20 (06) : 91 - &
  • [9] Low-complexity Post-processing Method for Speech Enhancement
    Bao, Feng
    Li, Yuepeng
    Shang, Shidong
    2021 12TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2021,
  • [10] ON THE IMPLICATION OF LIGHT FIELD COMPRESSION ON POST-PROCESSING ALGORITHMS
    Hariharan, Harini Priyadarshini
    Herfet, Thorsten
    2019 IEEE INTERNATIONAL SYMPOSIUM ON BROADBAND MULTIMEDIA SYSTEMS AND BROADCASTING (BMSB), 2019,