Task-Free Fairness-Aware Bias Mitigation for Black-Box Deployed Models

Cited by: 1
Authors
Cao, Guodong [1 ]
Wang, Zhibo [1 ]
Feng, Yunhe [2 ]
Dong, Xiaowei [1 ]
Zhang, Zhifei [3 ]
Qin, Zhan [4 ]
Ren, Kui [4 ]
Affiliations
[1] Wuhan Univ, Sch Cyber Sci & Engn, Key Lab Aerosp Informat Secur & Trusted Comp, Minist Educ, Wuhan 430072, Peoples R China
[2] Univ North Texas, Dept Comp Sci & Engn, Denton, TX 76203 USA
[3] Adobe Res, San Jose, CA 95110 USA
[4] Zhejiang Univ, Sch Cyber Sci & Technol, Hangzhou 310007, Peoples R China
Funding
National Key R&D Program of China; National Natural Science Foundation of China;
Keywords
Perturbation methods; Task analysis; Data models; Training; Closed box; Training data; Computational modeling; Adversarial learning; data utility; fairness; fairness-related attributes; mutual information;
DOI
10.1109/TDSC.2023.3328663
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812 ;
Abstract
With AI systems widely deployed in societal applications, the fairness of these models is of increasing concern: for instance, hiring systems should recommend applicants impartially across demographic groups, and risk assessment systems must eliminate racial inequity in the criminal justice system. Ensuring fairness in these models is therefore crucial. In this paper, we propose Task-Free Fairness-Aware Adversarial Perturbation (TF-FAAP), a flexible approach that improves the fairness of black-box deployed models by adding perturbations to input samples that blind their fairness-related attribute information, without modifying the model's parameters or structure. TF-FAAP consists of a discriminator and a generator that together create universal fairness-aware perturbations for a variety of tasks: the discriminator aims to distinguish fairness-related attributes, and the generator produces perturbations that make the discriminator's prediction distribution over fairness-related attributes uniform. To preserve the utility of perturbed samples, we maximize the mutual information between their representations and those of the corresponding original samples, thereby retaining more of the original samples' information. In addition, the perturbations generated by TF-FAAP are highly transferable, i.e., perturbations learned on one dataset can also alleviate the unfairness of a model trained on a different dataset. Extensive experimental evaluation demonstrates the effectiveness and superior performance of our method.
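The generator objective sketched in the abstract (pushing the discriminator's prediction distribution over fairness-related attributes toward uniform) can be illustrated with a KL-divergence-to-uniform loss. This is a minimal, hypothetical sketch, not the paper's exact formulation; the function name `uniformity_loss` and the choice of KL divergence are assumptions for illustration only.

```python
import math

def uniformity_loss(probs):
    """KL(p || uniform) for a discriminator's softmax output over k
    fairness-related attribute classes. Since the uniform probability is
    1/k, each term p_i * log(p_i / (1/k)) simplifies to p_i * log(p_i * k).
    A generator minimizing this drives the discriminator toward
    chance-level predictions, i.e. the attribute is "blinded"."""
    k = len(probs)
    return sum(p * math.log(p * k) for p in probs if p > 0)

# A confident attribute prediction incurs a high loss; a uniform
# (blinded) prediction incurs zero loss.
confident = uniformity_loss([0.95, 0.05])
blinded = uniformity_loss([0.5, 0.5])
```

In the full method this term would be combined with a utility term (e.g. a mutual-information lower bound between perturbed and original representations), so the perturbation blinds the attribute while preserving task-relevant content.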
Pages: 3390 - 3405
Page count: 16