Discovering Rule Lists with Preferred Variables

Cited by: 0
Authors
Papagianni, Ioanna [1]
van Leeuwen, Matthijs [1]
Affiliations
[1] Leiden Univ, LIACS, Leiden, Netherlands
Source
ADVANCES IN INTELLIGENT DATA ANALYSIS XXI, IDA 2023 | 2023 / Vol. 13876
Funding
Dutch Research Council (NWO);
Keywords
Classification; Probabilistic rule lists; Minimum description length (MDL) principle; Human-guided machine learning;
DOI
10.1007/978-3-031-30047-9_27
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Interpretable machine learning focuses on learning models that are inherently understandable by humans. Even such interpretable models, however, must be trustworthy for domain experts to adopt them. This requires not only accurate predictions, but also reliable explanations that do not contradict a domain expert's knowledge. When considering rule-based models, for example, rules may include certain variables either due to artefacts in the data, or due to the search heuristics used. When such rules are provided as explanations, this may lead to distrust. We investigate whether human guidance could benefit interpretable machine learning when it comes to learning models that provide both accurate predictions and reliable explanations. The form of knowledge that we consider is that of preferred variables, i.e., variables that the domain expert deems important enough to be given higher priority than the other variables. We study this question for the task of multiclass classification, use probabilistic rule lists as interpretable models, and use the minimum description length (MDL) principle for model selection. We propose S-Classy, an algorithm based on beam search that learns rule lists and takes preferred variables into account. We compare S-Classy to its baseline method, i.e., without using preferred variables, and empirically demonstrate that adding preferred variables does not harm predictive performance, while it does result in the preferred variables being used in rules higher up in the learned rule lists.
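To make the model class in the abstract concrete, the following is a minimal sketch of a probabilistic rule list: rules are checked in order, the first matching rule supplies a class distribution, and a default distribution covers everything else. It is not the S-Classy algorithm and does no beam search or MDL scoring. The names `Rule`, `RuleList`, and `first_position_of`, along with the toy variables and probabilities, are assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Rule:
    # Hypothetical representation: each condition is "variable == value".
    conditions: Dict[str, str]
    class_probs: Dict[str, float]

    def matches(self, row: Dict[str, str]) -> bool:
        # A rule fires only if every one of its conditions holds for the row.
        return all(row.get(var) == val for var, val in self.conditions.items())


@dataclass
class RuleList:
    rules: List[Rule]
    default_probs: Dict[str, float]

    def predict_proba(self, row: Dict[str, str]) -> Dict[str, float]:
        # First-match semantics: the earliest matching rule decides.
        for rule in self.rules:
            if rule.matches(row):
                return rule.class_probs
        return self.default_probs

    def predict(self, row: Dict[str, str]) -> str:
        probs = self.predict_proba(row)
        return max(probs, key=probs.get)


def first_position_of(rule_list: RuleList, variable: str) -> Optional[int]:
    # 1-based position of the first rule that uses the variable, or None.
    for position, rule in enumerate(rule_list.rules, start=1):
        if variable in rule.conditions:
            return position
    return None


# Toy rule list with invented variables and probabilities.
toy = RuleList(
    rules=[
        Rule({"fever": "yes"}, {"flu": 0.8, "cold": 0.2}),
        Rule({"cough": "yes"}, {"flu": 0.3, "cold": 0.7}),
    ],
    default_probs={"flu": 0.1, "cold": 0.9},
)
```

A helper such as `first_position_of` gives one simple way to check how high up in a rule list a preferred variable first appears, which is the kind of effect the abstract reports for preferred variables.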
Pages: 340 - 352
Number of pages: 13
Related papers
50 items in total
  • [31] Discovering Outstanding Subgroup Lists for Numeric Targets Using MDL
    Proenca, Hugo M.
    Grunwald, Peter
    Back, Thomas
    van Leeuwen, Matthijs
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2020, PT I, 2021, 12457 : 19 - 35
  • [32] Faster compression of patterns to Rectangle Rule Lists
    Raymundo Da Silva, Ian Albuquerque
    Calinescu, Gruia
    De Graaf, Nathan
    THEORETICAL COMPUTER SCIENCE, 2020, 828 : 1 - 18
  • [33] DISCOVERING INFLUENTIAL VARIABLES: A METHOD OF PARTITIONS
    Chernoff, Herman
    Lo, Shaw-Hwa
    Zheng, Tian
    ANNALS OF APPLIED STATISTICS, 2009, 3 (04) : 1335 - 1369
  • [34] Mail lists are preferred to newsgroups as teaching tools for undergraduate biology classes
    Machart, JM
    Silverthorn, DU
    FASEB JOURNAL, 1999, 13 (04) : A360 - A360
  • [35] Medicaid preferred drug lists: Cost containment and side effects - Introduction
    Headen, Alvin E., Jr.
    PHARMACOECONOMICS, 2006, 24 : 1 - 3
  • [36] Do state medicaid preferred drug lists affect patient safety?
    Elam, L
    Murawski, MM
    Childs, S
    Vanable, JW
    PSYCHIATRIC SERVICES, 2005, 56 (08) : 1012 - 1016
  • [37] Mailing lists are preferred to newsgroups as teaching tools for undergraduate biology classes
    Machart, JM
    Silverthorn, DU
    ADVANCES IN PHYSIOLOGY EDUCATION, 2000, 23 (01) : 67 - 71
  • [38] The impact of Medicaid's preferred drug lists on physicians' prescribing behaviour
    Virabhak, Suchin
    Sohn, Wook
    APPLIED ECONOMICS, 2009, 41 (21) : 2705 - 2725
  • [39] Learning Certifiably Optimal Rule Lists for Categorical Data
    Angelino, Elaine
    Larus-Stone, Nicholas
    Alabi, Daniel
    Seltzer, Margo
    Rudin, Cynthia
    JOURNAL OF MACHINE LEARNING RESEARCH, 2018, 18
  • [40] Algorithms for improving the dependability of firewall and filter rule lists
    Hazelhurst, S
    Attar, A
    Sinnappan, R
    DSN 2000: INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS, PROCEEDINGS, 2000 : 576 - 585