Leveraging Feature Bias for Scalable Misprediction Explanation of Machine Learning Models

Cited by: 2

Authors:

Gesi, Jiri [1]

Shen, Xinyun [1]

Geng, Yunfan [1]

Chen, Qihong [1]

Ahmed, Iftekhar [1]

Affiliation:

[1] Univ Calif Irvine, Donald Bren Sch ICS, Irvine, CA 92717 USA

Source:

2023 IEEE/ACM 45TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ICSE | 2023

Keywords:

machine learning; data imbalance; rule induction; misprediction explanation

DOI:

10.1109/ICSE48619.2023.00135

Chinese Library Classification (CLC) number:

TP31 [Computer Software]

Discipline classification code:

081202; 0835

Abstract:

Interpreting and debugging machine learning models is necessary to ensure their robustness, and explaining mispredictions can help significantly in doing so. While recent work on misprediction explanation has proven promising in generating interpretable explanations for mispredictions, state-of-the-art techniques "blindly" deduce misprediction explanation rules from all data features, which may not scale as the number of features grows. To alleviate this problem, we propose an efficient misprediction explanation technique named Bias Guided Misprediction Diagnoser (BGMD), which leverages two pieces of prior knowledge about data: a) data often exhibit highly skewed feature distributions, and b) trained models in many cases perform poorly on subsets of the data with under-represented features. Next, we propose a technique named MAPS (Mispredicted Area UPweight Sampling). During model retraining, MAPS increases the weights of the data subsets that belong to the group prone to misprediction because it contains under-represented features. Thus, MAPS makes the retrained model pay more attention to the under-represented features. Our empirical study shows that BGMD outperformed the state-of-the-art misprediction diagnoser and reduced diagnosis time by 92%. Furthermore, MAPS outperformed two state-of-the-art techniques at fixing the machine learning model's performance on mispredicted data without compromising performance on all data. All the research artifacts (i.e., tools, scripts, and data) of this study are available on the accompanying website [1].
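
The upweighting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' BGMD or MAPS implementation: the helper names (`underrepresented_values`, `upweight`), the `max_share` threshold, the weighting `factor`, and the toy `region` feature are all hypothetical, chosen only to show how rows carrying a rare feature value might receive a larger weight before retraining.

```python
from collections import Counter


def underrepresented_values(rows, feature, max_share=0.2):
    """Return the values of `feature` whose share of all rows is at most
    `max_share`. The threshold is an illustrative assumption."""
    counts = Counter(row[feature] for row in rows)
    total = len(rows)
    return {value for value, n in counts.items() if n / total <= max_share}


def upweight(rows, feature, rare_values, factor=3.0):
    """Assign weight `factor` to rows in the rare group and 1.0 to the rest.
    The factor is a hypothetical choice, not a value from the paper."""
    return [factor if row[feature] in rare_values else 1.0 for row in rows]


# Toy data: "south" appears in 1 of 6 rows, so it is under-represented.
rows = [
    {"region": "north"}, {"region": "north"}, {"region": "north"},
    {"region": "north"}, {"region": "north"}, {"region": "south"},
]
rare = underrepresented_values(rows, "region")
weights = upweight(rows, "region", rare)
```

Weights of this kind could then be passed to a training routine that accepts per-sample weights, so that errors on the rare group count more during retraining.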

Pages: 1559-1570

Page count: 12

50 items in total

[1] FAIRNES AND BIAS IN MACHINE LEARNING MODELS
Langworthy, Andrew
Journal of the Institute of Telecommunications Professionals, 2023, 17 : 29 - 33
[2] Counterfactual Explanation of Machine Learning Survival Models
Kovalev, Maxim
Utkin, Lev
Coolen, Frank
Konstantinov, Andrei
INFORMATICA, 2021, 32 (04) : 817 - 847
[3] Explanation of Machine Learning Models Using Improved Shapley Additive Explanation
Nohara, Yasunobu
Matsumoto, Koutarou
Soejima, Hidehisa
Nakashima, Naoki
ACM-BCB'19: PROCEEDINGS OF THE 10TH ACM INTERNATIONAL CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND HEALTH INFORMATICS, 2019, : 546 - 546
[4] Monotone Functions and Expert Models for Explanation of Machine Learning Models
Huber, Harlow
Kovalerchuk, Boris
2024 28TH INTERNATIONAL CONFERENCE INFORMATION VISUALISATION, IV 2024, 2024, : 227 - 235
[5] Network Intrusion Detection Leveraging Machine Learning and Feature Selection
Ali, Arshid
Shaukat, Shahtaj
Tayyab, Muhammad
Khan, Muazzam A.
Khan, Jan Sher
Arshad
Ahmad, Jawad
2020 IEEE 17TH INTERNATIONAL CONFERENCE ON SMART COMMUNITIES: IMPROVING QUALITY OF LIFE USING ICT, IOT AND AI (IEEEHONET 2020), 2020, : 49 - 53
[6] Mitigating Bias in Clinical Machine Learning Models
Julio C. Perez-Downes
Andrew S. Tseng
Keith A. McConn
Sara M. Elattar
Olayemi Sokumbi
Ronnie A. Sebro
Megan A. Allyse
Bryan J. Dangott
Rickey E. Carter
Demilade Adedinsewo
Current Treatment Options in Cardiovascular Medicine, 2024, 26 : 29 - 45
[7] Mitigating Bias in Clinical Machine Learning Models
Perez-Downes, Julio C.
Tseng, Andrew S.
McConn, Keith A.
Elattar, Sara M.
Sokumbi, Olayemi
Sebro, Ronnie A.
Allyse, Megan A.
Dangott, Bryan J.
Carter, Rickey E.
Adedinsewo, Demilade
CURRENT TREATMENT OPTIONS IN CARDIOVASCULAR MEDICINE, 2024, 26 (03) : 29 - 45
[8] Deep Discriminative Feature Learning and Feature Space Transformation for Scalable Machine Fault Diagnosis
Sreekumar, K. T.
Kumar, C. Santhosh
Ramachandran, K. I.
IEEE ACCESS, 2024, 12 : 107944 - 107958
[9] Multi-objective Feature Attribution Explanation for Explainable Machine Learning
Wang Z.
Huang C.
Li Y.
Yao X.
ACM Transactions on Evolutionary Learning and Optimization, 2024, 4 (01):
[10] Leveraging Automated Machine Learning to provide NAFLD screening diagnosis: Proposed machine learning models
Shah, Ali Haider
Bangash, Ali Haider
Fatima, Arshiya
Zehra, Saiqa
Abbas, Syed Mohammad Mehmood
Shah, Syed Mohammad Qasim
Ashraf, Muhammad
Ali, Aliya
Baloch, Adil
Khan, Ayesha Khalid
Khawaja, Hashir Fahim
Ayesha, Noor
Asghar, Saleha Yurf
Zahra, Tatheer
METABOLISM-CLINICAL AND EXPERIMENTAL, 2022, 128 : S10 - S11