Improved Feature Importance Computation for Tree Models Based on the Banzhaf Value

Cited by: 0
Authors
Karczmarz, Adam [1 ,2 ]
Michalak, Tomasz [1 ,2 ]
Mukherjee, Anish [1 ,2 ]
Sankowski, Piotr [1 ,2 ,3 ]
Wygocki, Piotr [1 ,3 ]
Affiliations
[1] Univ Warsaw, Inst Informat, Warsaw, Poland
[2] IDEAS NCBR, Warsaw, Poland
[3] MIM Solut, Warsaw, Poland
Funding
European Research Council;
Keywords
EXPLAINABLE AI; SHAPLEY VALUE; EXPLANATIONS; TRACTABILITY;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Shapley value, a fundamental game-theoretic solution concept, has recently become one of the main tools used to explain predictions of tree ensemble models. Another well-known game-theoretic solution concept is the Banzhaf value. Although the Banzhaf value is closely related to the Shapley value, its properties w.r.t. feature attribution have not been understood equally well. This paper shows that, for tree ensemble models, the Banzhaf value offers some crucial advantages over the Shapley value while providing similar feature attributions. In particular, we first give an optimal O(TL + n) time algorithm for computing the Banzhaf value-based attribution of a tree ensemble model's output. Here, T is the number of trees, L is the maximum number of leaves in a tree, and n is the number of features. In comparison, the state-of-the-art Shapley value-based algorithm runs in O(TLD^2 + n) time, where D denotes the maximum depth of a tree in the ensemble. Next, we experimentally compare the Banzhaf and Shapley values for tree ensemble models. Both methods deliver essentially the same average importance scores for the studied datasets using two different tree ensemble models (the sklearn implementation of Decision Trees and the xgboost implementation of Gradient Boosting Decision Trees). However, our results indicate that, on top of being computable faster, the Banzhaf value is more numerically robust than the Shapley value.
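To make the two solution concepts in the abstract concrete, the following minimal sketch computes both values by brute force from their textbook definitions. It is not the paper's O(TL + n) algorithm, and it does not touch any tree model: the function names `banzhaf_values`, `shapley_values`, and the toy game `unanimity` are illustrative assumptions. The toy game pays 1 only when all three players (features) are present.

```python
from itertools import combinations
from math import factorial


def banzhaf_values(n, v):
    """Brute-force Banzhaf value: the marginal contribution of player i,
    averaged uniformly over all 2^(n-1) coalitions that exclude i."""
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(len(others) + 1):
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += v(s | {i}) - v(s)
        values.append(total / 2 ** (n - 1))
    return values


def shapley_values(n, v):
    """Brute-force Shapley value: the same marginal contributions, but
    weighted by |S|! (n - |S| - 1)! / n! instead of uniformly."""
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(len(others) + 1):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (v(s | {i}) - v(s))
        values.append(total)
    return values


def unanimity(coalition):
    """Hypothetical toy game: worth 1 only if all three players are present."""
    return 1.0 if len(coalition) == 3 else 0.0


banzhaf = banzhaf_values(3, unanimity)
shapley = shapley_values(3, unanimity)
print("Banzhaf:", banzhaf)
print("Shapley:", shapley)
```

In this toy game each player receives a Banzhaf value of 1/4 and a Shapley value of 1/3. The Shapley values sum to the value of the full coalition, while the Banzhaf values do not, which is the main structural difference between the two. Both brute-force routines take time exponential in n, which is why tree-specific algorithms such as those discussed in the abstract matter in practice.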
Pages: 969-979
Page count: 11