Making clusterings fairer by post-processing: algorithms, complexity results and experiments

Authors
Davidson, Ian [1]
Bai, Zilong [1]
Tran, Cindy Mylinh [1]
Ravi, S. S. [2, 3]
Affiliations
[1] Univ Calif Davis, Comp Sci Dept, Davis, CA 95616 USA
[2] Univ Virginia, Biocomplex Inst & Initiat, Charlottesville, VA 22904 USA
[3] SUNY Albany, Dept Comp Sci, Albany, NY 12222 USA
Keywords
Clustering; Protected status; Fairness; Algorithms; Complexity
DOI
10.1007/s10618-022-00893-6
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
While existing fairness work typically focuses on fair-by-design algorithms, here we consider making a fairness-unaware algorithm's output fairer. Specifically, we explore fairness in clustering by modifying clusterings produced by existing algorithms to make them fairer whilst retaining their quality. We formulate the minimal cluster modification for fairness (MCMF) problem, where the input is a given partitional clustering and the goal is to change it minimally so that the clustering is still of good quality but fairer. We show that for a single binary protected status variable, the problem is efficiently solvable (i.e., in the class P) by proving that the constraint matrix for an integer linear programming formulation is totally unimodular. Interestingly, we show that even for a single protected variable, the addition of simple pairwise guidance for clustering (for example, to ensure individual-level fairness) makes the MCMF problem computationally intractable (i.e., NP-hard). Experimental results using Twitter, Census and NYT data sets show that our methods can modify existing clusterings for data sets with more than 100,000 instances within minutes on laptops and find clusterings that are as fair as, but of higher quality than, those produced by fair-by-design clustering algorithms. Finally, we explore a challenging practical problem of making a historical clustering (i.e., zipcodes clustered into California's congressional districts) fairer using a new multi-faceted benchmark data set.
Pages: 1404-1440
Number of pages: 37
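
The abstract describes post-processing a given clustering so that it becomes fairer with respect to a single binary protected status variable while changing it as little as possible. The sketch below is not the authors' MCMF formulation or their integer linear program; it only illustrates, under assumptions, the two quantities such a post-processing step trades off. The fairness measure used here (the largest gap between a cluster's protected fraction and the overall protected fraction) and the function names `protected_fraction_per_cluster`, `max_imbalance` and `num_reassigned` are hypothetical choices made for illustration.

```python
from collections import Counter


def protected_fraction_per_cluster(labels, protected):
    """Map each cluster id to the fraction of its members with protected == 1."""
    totals = Counter(labels)
    positives = Counter(c for c, p in zip(labels, protected) if p == 1)
    return {c: positives[c] / totals[c] for c in totals}


def max_imbalance(labels, protected):
    """Assumed fairness measure: the largest absolute gap between a cluster's
    protected fraction and the protected fraction of the whole data set."""
    overall = sum(protected) / len(protected)
    fractions = protected_fraction_per_cluster(labels, protected)
    return max(abs(f - overall) for f in fractions.values())


def num_reassigned(original, modified):
    """Count the instances whose cluster label differs between two clusterings."""
    return sum(1 for a, b in zip(original, modified) if a != b)


# Toy data: 8 instances, 2 clusters, one binary protected attribute.
protected = [1, 1, 1, 0, 0, 0, 0, 1]
original = [0, 0, 0, 0, 1, 1, 1, 1]  # output of a fairness-unaware algorithm
modified = [0, 0, 1, 0, 1, 1, 0, 1]  # a candidate post-processed clustering

print(max_imbalance(original, protected))
print(max_imbalance(modified, protected))
print(num_reassigned(original, modified))
```

In this toy case, swapping two instances between the clusters brings both clusters to the overall protected fraction. A post-processing method of the kind the abstract describes would search for such a modification automatically, subject to whatever quality and fairness requirements the paper defines.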