Discovering Rule Lists with Preferred Variables

Cited by: 0

Authors:

Papagianni, Ioanna [1]

van Leeuwen, Matthijs [1]

Affiliations:

[1] Leiden Univ, LIACS, Leiden, Netherlands

Source:

ADVANCES IN INTELLIGENT DATA ANALYSIS XXI, IDA 2023 | 2023 / Vol. 13876

Funding:

Dutch Research Council;

Keywords:

Classification; Probabilistic rule lists; Minimum description length (MDL) principle; Human-guided machine learning;

DOI:

10.1007/978-3-031-30047-9_27

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Interpretable machine learning focuses on learning models that are inherently understandable by humans. Even such interpretable models, however, must be trustworthy for domain experts to adopt them. This requires not only accurate predictions, but also reliable explanations that do not contradict a domain expert's knowledge. When considering rule-based models, for example, rules may include certain variables either due to artefacts in the data, or due to the search heuristics used. When such rules are provided as explanations, this may lead to distrust. We investigate whether human guidance could benefit interpretable machine learning when it comes to learning models that provide both accurate predictions and reliable explanations. The form of knowledge that we consider is that of preferred variables, i.e., variables that the domain expert deems important enough to be given higher priority than the other variables. We study this question for the task of multiclass classification, use probabilistic rule lists as interpretable models, and use the minimum description length (MDL) principle for model selection. We propose S-Classy, an algorithm based on beam search that learns rule lists and takes preferred variables into account. We compare S-Classy to its baseline method, i.e., without using preferred variables, and empirically demonstrate that adding preferred variables does not harm predictive performance, while it does result in the preferred variables being used in rules higher up in the learned rule lists.

Pages: 340-352

Number of pages: 13

50 records in total

[31] Discovering Outstanding Subgroup Lists for Numeric Targets Using MDL
Proenca, Hugo M.
Grunwald, Peter
Back, Thomas
van Leeuwen, Matthijs
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2020, PT I, 2021, 12457 : 19 - 35
[32] Faster compression of patterns to Rectangle Rule Lists
Raymundo Da Silva, Ian Albuquerque
Calinescu, Gruia
De Graaf, Nathan
THEORETICAL COMPUTER SCIENCE, 2020, 828 : 1 - 18
[33] DISCOVERING INFLUENTIAL VARIABLES: A METHOD OF PARTITIONS
Chernoff, Herman
Lo, Shaw-Hwa
Zheng, Tian
ANNALS OF APPLIED STATISTICS, 2009, 3 (04) : 1335 - 1369
[34] Mail lists are preferred to newsgroups as teaching tools for undergraduate biology classes
Machart, JM
Silverthorn, DU
FASEB JOURNAL, 1999, 13 (04) : A360 - A360
[35] Medicaid preferred drug lists: Cost containment and side effects - Introduction
Headen, Alvin E., Jr.
PHARMACOECONOMICS, 2006, 24 : 1 - 3
[36] Do state medicaid preferred drug lists affect patient safety?
Elam, L
Murawski, MM
Childs, S
Vanable, JW
PSYCHIATRIC SERVICES, 2005, 56 (08) : 1012 - 1016
[37] Mailing lists are preferred to newsgroups as teaching tools for undergraduate biology classes
Machart, JM
Silverthorn, DU
ADVANCES IN PHYSIOLOGY EDUCATION, 2000, 23 (01) : 67 - 71
[38] The impact of Medicaid's preferred drug lists on physicians' prescribing behaviour
Virabhak, Suchin
Sohn, Wook
APPLIED ECONOMICS, 2009, 41 (21) : 2705 - 2725
[39] Learning Certifiably Optimal Rule Lists for Categorical Data
Angelino, Elaine
Larus-Stone, Nicholas
Alabi, Daniel
Seltzer, Margo
Rudin, Cynthia
JOURNAL OF MACHINE LEARNING RESEARCH, 2018, 18
[40] Algorithms for improving the dependability of firewall and filter rule lists
Hazelhurst, S
Attar, A
Sinnappan, R
DSN 2000: INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS, PROCEEDINGS, 2000 : 576 - 585