Improved Feature Importance Computation for Tree Models Based on the Banzhaf Value

Times Cited: 0
Authors
Karczmarz, Adam [1 ,2 ]
Michalak, Tomasz [1 ,2 ]
Mukherjee, Anish [1 ,2 ]
Sankowski, Piotr [1 ,2 ,3 ]
Wygocki, Piotr [1 ,3 ]
Affiliations
[1] Univ Warsaw, Inst Informat, Warsaw, Poland
[2] IDEAS NCBR, Warsaw, Poland
[3] MIM Solut, Warsaw, Poland
Funding
European Research Council;
Keywords
EXPLAINABLE AI; SHAPLEY VALUE; EXPLANATIONS; TRACTABILITY;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
The Shapley value - a fundamental game-theoretic solution concept - has recently become one of the main tools used to explain predictions of tree ensemble models. Another well-known game-theoretic solution concept is the Banzhaf value. Although the Banzhaf value is closely related to the Shapley value, its properties w.r.t. feature attribution have not been understood equally well. This paper shows that, for tree ensemble models, the Banzhaf value offers some crucial advantages over the Shapley value while providing similar feature attributions. In particular, we first give an optimal O(TL + n) time algorithm for computing the Banzhaf value-based attribution of a tree ensemble model's output. Here, T is the number of trees, L is the maximum number of leaves in a tree, and n is the number of features. In comparison, the state-of-the-art Shapley value-based algorithm runs in O(TLD^2 + n) time, where D denotes the maximum depth of a tree in the ensemble. Next, we experimentally compare the Banzhaf and Shapley values for tree ensemble models. Both methods deliver essentially the same average importance scores for the studied datasets using two different tree ensemble models (the sklearn implementation of decision trees and the xgboost implementation of gradient-boosted decision trees). However, our results indicate that, on top of being computable faster, the Banzhaf value is more numerically robust than the Shapley value.
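To make the quantity being computed concrete: the Banzhaf value of feature i is phi_i = 2^{-(n-1)} * sum over all subsets S of the remaining features of [v(S ∪ {i}) - v(S)], i.e. the marginal contribution of i averaged uniformly over all coalitions. The following is a brute-force sketch of this definition (exponential in n, not the paper's O(TL + n) algorithm), using an interventional value function in which features in S are fixed to the explained instance and the rest are drawn from a background sample; the function name and signature are illustrative, not from the paper.

```python
from itertools import combinations

import numpy as np


def banzhaf_attributions(model_fn, x, background, n_features):
    """Brute-force Banzhaf attributions for a black-box model.

    model_fn   : callable mapping an (m, n_features) array to m outputs
    x          : the instance being explained, shape (n_features,)
    background : background data, shape (m, n_features)
    """

    def v(S):
        # Interventional value function: fix features in S to x,
        # average the model output over the background sample.
        X = background.copy()
        idx = list(S)
        X[:, idx] = x[idx]
        return model_fn(X).mean()

    phi = np.zeros(n_features)
    for i in range(n_features):
        rest = [j for j in range(n_features) if j != i]
        total = 0.0
        # Sum the marginal contribution of i over every subset of the
        # remaining features (2^(n-1) subsets).
        for r in range(len(rest) + 1):
            for S in combinations(rest, r):
                total += v(set(S) | {i}) - v(set(S))
        phi[i] = total / 2 ** (n_features - 1)
    return phi
```

For an additive model the marginal contribution of a feature is the same for every coalition, so the Banzhaf and Shapley attributions coincide there; the two notions differ only in how coalitions are weighted (uniformly for Banzhaf, by coalition size for Shapley).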
Pages: 969-979 (11 pages)