Leveraging Feature Bias for Scalable Misprediction Explanation of Machine Learning Models

Cited by: 2
Authors
Gesi, Jiri [1]
Shen, Xinyun [1]
Geng, Yunfan [1]
Chen, Qihong [1]
Ahmed, Iftekhar [1]
Affiliations
[1] University of California, Irvine, Donald Bren School of Information and Computer Sciences, Irvine, CA 92717, USA
Keywords
machine learning; data imbalance; rule induction; misprediction explanation
DOI
10.1109/ICSE48619.2023.00135
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Interpreting and debugging machine learning models is necessary to ensure their robustness, and explaining mispredictions can help significantly in doing so. While recent work on misprediction explanation has proven promising in generating interpretable explanations, state-of-the-art techniques "blindly" deduce misprediction explanation rules from all data features, which may not scale as the number of features grows. To alleviate this problem, we propose an efficient misprediction explanation technique named Bias Guided Misprediction Diagnoser (BGMD), which leverages two pieces of prior knowledge about data: a) data often exhibit highly skewed feature distributions, and b) trained models in many cases perform poorly on data subsets with under-represented features. Next, we propose a technique named MAPS (Mispredicted Area UPweight Sampling). During model retraining, MAPS increases the weights of the data subsets that are prone to misprediction because they contain under-represented features, so the retrained model pays more attention to those features. Our empirical study shows that BGMD outperformed the state-of-the-art misprediction diagnoser and reduced diagnosis time by 92%. Furthermore, MAPS outperformed two state-of-the-art techniques at improving the model's performance on mispredicted data without compromising performance on all data. All research artifacts (i.e., tools, scripts, and data) of this study are available on the accompanying website [1].
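The upweighting idea behind MAPS can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the rule predicate, the `device` feature, and the upweight factor below are all hypothetical, and the sketch only shows how samples matched by a misprediction rule could receive larger weights, which a retraining step could then use as per-sample weights.

```python
from typing import Callable, Dict, List, Sequence

Row = Dict[str, object]


def upweight_sample_weights(
    rows: Sequence[Row],
    in_mispredicted_area: Callable[[Row], bool],
    factor: float = 3.0,
) -> List[float]:
    """Give weight `factor` to rows matched by the rule, 1.0 to all others."""
    if factor < 1.0:
        raise ValueError("factor must be >= 1.0 to upweight")
    return [factor if in_mispredicted_area(row) else 1.0 for row in rows]


def weighted_error_rate(
    y_true: Sequence[int], y_pred: Sequence[int], weights: Sequence[float]
) -> float:
    """Fraction of total weight carried by wrongly predicted samples."""
    total = sum(weights)
    wrong = sum(w for t, p, w in zip(y_true, y_pred, weights) if t != p)
    return wrong / total


# Toy data (hypothetical): "tablet" is the under-represented feature value.
rows = [
    {"device": "desktop"},
    {"device": "desktop"},
    {"device": "desktop"},
    {"device": "desktop"},
    {"device": "tablet"},
]
y_true = [1, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0]  # the only error falls on the tablet row


def rule(row: Row) -> bool:
    # Hypothetical misprediction rule, as a diagnoser might report it.
    return row["device"] == "tablet"


uniform = [1.0] * len(rows)
weights = upweight_sample_weights(rows, rule, factor=4.0)

print(weighted_error_rate(y_true, y_pred, uniform))
print(weighted_error_rate(y_true, y_pred, weights))
```

With uniform weights the single error on the under-represented row accounts for one fifth of the total weight; after upweighting it accounts for half, so an objective that uses these weights penalizes that error more heavily during retraining.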
Pages: 1559-1570
Page count: 12